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Abstract 

The  problem  of  estimating  boundary  and  initial  conditions  for  a  regional  open- 
ocean  model  is  addressed  here.  With  the  objective  of  mimicking  the  SYNOP  experi¬ 
ment  in  the  Gulf  Stream  system,  a  meandering  jet  is  modeled  by  the  fully  nonlinear 
barotropic  vorticity  equation.  Simulated  observations  of  current  velocity  are  taken 
using  current  meters  and  acoustic  tomography.  Twin  experiments  are  performed  in 
which  the  adjoint  method  is  used  to  reconstruct  the  flow  field.  The  estimated  flow  is 
forced  to  resemble  the  true  flow  by  minimizing  a  cost  function  with  respect  to  some 
control  variables.  At  the  minimum,  the  error  covariance  matrix  of  the  estimated 
control  variables  can  be  evaluated  from  the  inverse  Hessian  of  the  cost  function. 

The  first  major  scientific  objective  of  this  work  is  the  estimation  of  initial 
and  boundary  conditions  for  the  model  from  sparse  interior  data.  First  the  vorticity 
initial  conditions  are  used  as  control  variables  and  the  boundary  conditions  are  kept 
fixed.  The  jet-like  flow  is  found  to  induce  strong  dependence  of  the  model/data 
misfit  upon  the  specified  boundary  conditions.  Successively,  the  boundary  values  of 
streamfunction  and  vorticity  are  included  among  the  control  variables  and  estimated 
together  with  the  initial  conditions.  Experiments  are  performed  with  various  choices 
of  scaling  and  first  guess  for  the  control  variables,  and  various  observational  strategies. 
The  first  major  new  result  obtained  is  the  successful  estimation  of  the  complete 
set  of  initial  and  boundary  conditions,  necessary  to  integrate  the  vorticity  equation 
forward  in  time.  From  a  time-invariant  first  guess  for  the  boundary  conditions,  the 
assimilation  is  able  to  create  temporal  variations  at  the  boundaries  that  make  the 
interior  flow  match  well  the  velocity  observations,  even  when  noise  is  added.  It  is 
found  that  information  from  the  observations  is  communicated  to  the  boundaries 
by  advection  of  vorticity  and  by  the  instantaneous  domain— wide  connections  in  the 
streamfunction  field  due  to  the  elliptic  character  of  the  Poisson  equation. 

The  second  major  scientific  objective  is  the  estimation  of  error  covariances  in 
the  presence  of  strongly  nonlinear  dynamics.  The  evaluation  of  the  full  error  covari¬ 
ance  matrix  for  the  estimated  control  variables  from  the  inverse  Hessian  matrix  is 
investigated  along  with  its  dependence  upon  the  degree  of  nonlinearity  in  the  dy- 
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namics.  Further  major  new  results  here  obtained  are  the  availability  of  off-diagonal 
covariances,  the  successful  calculation  of  error  covariances  for  all  boundary  and  ini¬ 
tial  conditions,  and  the  estimation  of  errors  for  the  interior  fields  of  streamfunction 
and  vorticity.  The  role  of  the  Hessian  matrix  is  assessed  in  gauging  the  sensitivity 
of  the  estimated  boundary  and  initial  conditions  to  the  data.  Also,  the  importance 
is  stressed  of  retaining  off-diagonal  structure  of  the  Hessian  to  obtain  more  accurate 
error  estimates. 

Thesis  Supervisor;  Paola  Malanotte-Rizzoli, 

Professor  of  Physical  Oceanography 

Department  of  Earth,  Atmospheric  and  Planetary  Sciences 
Massachusetts  Institute  of  Technology 
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Introduction 


During  1988  and  1989,  a  moored  array  of  current  meters  and  acoustic  tomographic 
transceivers  were  deployed  at  39°N,  55°W  in  the  northwest  Atlantic  Ocean.  This  array 
was  part  of  the  SYNOP  (Synoptic  Ocean  Prediction)  experiment  and  the  purpose 
of  the  deployment  was  to  provide  time  series  of  velocity,  temperature  and  acoustic 
travel  times,  as  a  foundation  for  modeling  and  assimilation  studies.  A  motivation 
for  these  studies  was  to  construct  the  observed  spatial  and  temporal  behavior  of  a 
strong  western  boundary  current,  ie.  the  the  Gulf  Stream,  to  study  the  dynamics  of 
the  system. 

Of  particular  interest  is  the  high  eddy  kinetic  energy  found  along  the  Gulf 
Stream  axis,  extending  considerable  distances  into  the  interior  (see  maps  in  Schmitz, 
1976,  and  Wyrtki,  1976).  Some  theoretical  and  modelling  studies.  Gill  et  al.  (1974) 
and  Holland  &:  Rhines  (1980),  indicate  that  the  southern  recirculation  is  a  region 
of  baroclinic  instability  and  could  be  a  source  of  eddy  energy.  Bryden  (1982)  and 
Hogg  (1985)  looked  for  evidence  of  this  baroclinic  instability  from  observations  taken 
in  the  southern  recirculation.  However  the  sparsity  of  the  observations  (LDE  and 
POLYMODE)  in  space  made  it  difficult  to  measure  the  growth  rates  and  vertical 
phase  propagation  associated  with  baroclinic  instability.  Using  objective  analysis 
and  quasigeostrophy,  Hua  et  al.  (1986),  extrapolated  data  from  the  LDE  experiment 
to  produce  three-dimensional  maps  of  streamfunction  and  potential  vorticity,  but 
the  mapping  requires  a  priori  spatial  covariances  to  be  supplied.  Furthermore  the 
temporal  behavior  of  mapped  fields  is  difficult  to  estimate  from  data  alone.  All  of  these 
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studies  point  to  the  desirability  of  extrapolating  information  from  the  observations 
in  space  and  time  using  dynamical  constraints. 

An  alternative  source  of  eddy  activity  in  the  southern  recirculation  region  is  the 
radiation  of  planetary  waves  by  the  Gulf  Stream.  Pedlosky  (1977),  Malanotte-Rizzoli 
(1984)  and  Hogg  (1988)  present  theoretical  studies  of  the  processes  by  which  the  Gulf 
Stream  could  radiate  planetary  waves,  and  numerical  modelling  experiments  were 
performed  by  Malanotte-Rizzoli  et  al.  (1987,  1995)  to  investigate  these  processes. 
The  Eliassen-Palm  relation  for  time-mean  flows,  which  arises  from  the  conservation 
of  enstrophy  (Plumb,  1986),  predicts  that  the  radiated  waves  can  interact  with  the 
mean  flow  in  the  southern  recirculation  to  produce  eddies.  Chester  et  al.  (1994)  was 
able  to  measure  the  radiative  component  of  the  Eliassen-Palm  relation  using  data 
from  the  SYNOP  East  array  at  55° W.  Knowing  the  relative  sizes  of  terms  in  the 
Eliassen-Palm  relation  will  determine  whether  the  region  is  a  source  or  sink  of  eddy 
enstrophy,  and  whether  wave  radiation  is  significant. 

Unfortunately  both  physical  processes,  baroclinic  instability  and  wave  radia¬ 
tion,  require  estimates  of  eddy  potential  vorticity  fluxes  over  long  periods  of  time 
which  are  difficult  to  make  from  observations.  However  potential  vorticity  conser¬ 
vation  may  be  used  as  a  dynamical  interpolater  to  extract  as  much  information  as 
possible  from  the  available  data  and  to  construct  spatially  and  temporally  varying 
fields  of  potential  vorticity.  The  assimilation  of  temperature  and  velocity  observations 
from  the  SYNOP  East  array,  into  a  dynamical  model  of  the  ocean  in  the  region  of  the 
southern  recirculation,  is  necessary  in  order  to  perform  such  a  dynamical  interpolation 
and  to  examine  the  physical  processes. 

Thus,  a  prime  motivation  for  model/data  assimilations  is  the  potential  to  con¬ 
struct  the  behavior  of  a  dynamical  system  in  regions  of  space  and  time  that  are  devoid 
of  data.  The  assimilation  allows  the  construction  of  fields  of  important  dynamical 
quantities,  such  as  potential  vorticity,  which  could  not  otherwise  be  directly  observed 
in  the  real  ocean.  An  assimilation  may  also  supply  error  estimates  to  give  a  level  of 
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confidence  for  the  estimated  fields.  An  attractive  feature  of  the  SYNOP  data  set  for 
assimilation  studies  is  that  motions  due  to  the  Gulf  Stream  have  high  signal  to  noise 
ratio  compared  to  other,  more  quiescent,  parts  of  the  ocean.  Meanders  and  rings 
provide  coherent  signals  in  the  data,  which  a  dynamical  model  can  reproduce. 

The  prospect  of  assimilating  the  SYNOP  East  data  into  an  ocean  model  im¬ 
mediately  raises  two  important  issues.  First,  the  large  horizontal  velocity  gradients 
observed  in  the  Gulf  Stream  and  its  recirculations  imply  that  fully  nonlinear  dynamics 
are  needed  to  adequately  model  the  flow.  Second,  the  SYNOP  East  observing  array 
covers  a  rectangular  region  approximately  200  km  by  600  km,  straddling  the  Gulf 
Stream  at  55°W.  To  successfully  constrain  an  ocean  model  using  this  dataset  alone, 
the  spatial  extent  of  the  model  domain  cannot  be  too  great,  otherwise  large  regions 
of  the  model  will  not  be  influenced  by  the  data.  Thus,  a  basin-scale  model  may  not 
be  the  most  useful  and  the  model  domain  may  be  chosen  so  as  its  boundaries  are  in 
the  open  ocean.  These  two  issues  have  important  implications  for  the  model/data 
assimilation. 

The  issues  of  assimilation  in  the  presence  of  strong  nonlinearity  and  the  im¬ 
plementation  of  open  boundaries  are  of  fundamental  importance  to  modeling  and 
assimilation  studies.  They  are  challenging  issues  that  need  to  be  fully  investigated 
and  understood.  In  this  study,  model/data  assimilations  are  performed  using  sim¬ 
ulated  velocity  observations  with  a  simple  nonlinear  dynamical  model  (barotropic 
conservation  of  vorticity).  With  simulated  data  the  true  flow  field  is  known,  thus  the 
estimated  flow  field  from  the  assimilation  can  be  directly  compared  to  the  true  flow 
field.  The  effects  of  nonlinear  dynamics  and  the  presence  of  open  boundaries  on  esti¬ 
mated  dynamical  fields  and  their  error  estimates  can  thus  be  well  investigated.  This 
idealized  approach  is  a  necessary  first  step  before  the  actual  SYNOP  East  dataset 
can  be  assimilated  into  an  open-ocean  model. 

The  fundamental  difficulty  caused  by  strongly  nonlinear  dynamics  in  assimila¬ 
tion  studies  is  the  unpredictability  of  error  growth  as  the  model  state  evolves  forward 
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in  time.  For  simple  dynamical  models,  the  dependence  of  predictability  upon  the 
degree  of  nonlinearity  has  been  studied  by  Vallis  (1983)  and  Lacarra  &  Talagrand 
(1988).  Robinson  &  Haidvogel  (1980)  investigated  error  growth  in  a  barotropic  open 
ocean  model  identifying  three  sources  of  error:  1 )  physical  errors  due  to  inadequate 
representation  of  unresolved  scales  of  motion;  2)  computational  errors  due  to  trunca¬ 
tion  errors  in  the  discretization  and  inaccurate  implementation  of  the  open  boundaries 
and  3)  observational  errors  in  the  available  data,  including  inaccuracies  in  the  ini¬ 
tial  conditions  and  boundary  conditions.  The  propagation  of  errors  in  a  strongly 
nonlinear  model  is  a  nontrivial  issue. 

Traditionally  oceanographers  have  assimilated  data  into  nonlinear  models  by 
linearizing  the  dynamics  about  some  known  model  state  (see  Ghil  &  Malanotte- 
Rizzoli,  1991,  for  a  review  of  assimilation  methods).  This  enables  the  temporal  evo¬ 
lution  of  the  dynamical  fields  to  be  expressed  in  transition  matrix  form  (Bryson  & 
Ho,  1975),  allowing  the  propagation  of  errors  to  be  calculated,  as  is  done  in  the  ex¬ 
tended  Kalman  filter.  However,  for  many  oceanographic  cases,  such  as  the  variability 
of  the  Gulf  Stream  system  under  study  here,  a  fully  nonlinear  model  needs  to  be  used 
for  data  assimilation.  There  is  a  limited  choice  of  assimilation  methods  available 
(Evenson,  1994). 

The  adjoint  method  allows  assimilation  of  data  into  a  strongly  nonlinear  model 
without  linearizing  the  dynamics.  A  decription  of  the  adjoint  method,  as  it  is  imple¬ 
mented  here,  is  presented  in  the  appendix.  Details  on  the  history  of  this  assimilation 
method  may  be  found  in  Bennett  (1992),  Ghil  &  Malanotte-Rizzoli  (1991)  and  Ta¬ 
lagrand  &  Courtier  (1987).  In  short,  the  adjoint  method  minimizes  a  cost  function 
which  expresses  the  distance  between  the  true  observations  and  those  produced  by 
the  model.  The  minimum  of  the  cost  function  is  found  by  varying  a  set  of  control 
variables. 

Generally,  the  control  variables  are  some  subset  of  the  model  parameters. 
Schroter  &  Wunsch  (1986)  and  Tziperman  &  Thacker  (1989)  used  wind-stress  curl. 
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frictional  parameters  and  initial  conditions  as  control  variables  for  nonlinear  barotropic 
models  in  a  basin,  and  successfully  assimilated  simulated  data.  Moore  (1991)  used 
initial  conditions  as  control  variables  to  assimilate  real  satellite  altimetric  data  from 
GEOSAT  into  a  quasi-geostrophic  open  ocean  model  of  the  Gulf  Stream  system. 
The  boundary  conditions  were  kept  at  prescribed  values  throughout  the  assimilation. 
A  quasi-geostrophic  open-ocean  model  of  the  Gulf  Stream  was  also  used  by  Seiler 
(1993)  in  an  adjoint  assimilation  using  simulated  GEOSAT  data.  Here  the  boundary 
conditions  were  used  as  control  variables  and  the  initial  conditions  were  prescribed. 

None  of  these  previous  studies  have  addressed  the  issue  of  estimating  the  error 
in  the  estimated  fields  in  the  presence  of  strong  nonlinearity.  When  the  minimum  of 
the  cost  function  is  found,  the  Hessian  matrix  of  the  cost  function  may  be  inverted 
to  give  the  estimated  error  covariances  in  the  estimated  control  variables  (Thacker, 
1989).  While  this  inversion  is  possible,  the  cost  function  contains  multiple  local 
minima  when  the  dynamics  are  strongly  nonlinear,  and  the  adjoint  method  may 
provide  convergence  to  one  of  these  local  minima  and  not  to  the  global  minimum. 
Thus,  care  must  be  taken  in  interpreting  error  estimates  from  the  inverse  Hessian. 

The  first  major  scientific  objective  of  this  dissertation  is  the  optimal  estimation 
of  the  flow  field  from  sparse  data,  and  the  estimation  of  the  error  covariances,  in  the 
presence  of  strongly  nonlinear  dynamics. 

The  problem  of  open  boundaries  in  a  regional  model  has  been  with  oceanog¬ 
raphers  and  meteorologists  for  a  long  time.  Charney  et  al.  (1950)  first  addressed  the 
problem  for  dynamics  governed  by  the  barotropic  vorticity  equation.  They  discov¬ 
ered  a  sufficient  set  of  boundary  conditions  to  integrate  the  model  forward  in  time, 
namely:  vorticity  throughout  the  interior  at  the  initial  time,  streamfunction  over  all 
the  open  boundary  at  all  times,  and  vorticity  at  inflow  parts  of  the  open  boundary  at 
all  times.  This  approach  had  been  generally  accepted  until  Miller  &  Bennett  (1988) 
pointed  out  that  the  open-boundary  problem,  as  configured  by  Charney  et  al.,  is 
formally  ill-posed.  The  ill-posedness  arises  at  so-called  characteristic  points  on  the 
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open  boundary,  where  the  component  of  the  flow  normal  to  the  open  boundary  is  zero. 
At  these  points,  the  streamline  is  locally  tangent  to  the  boundary,  and  the  transition 
from  inflow  to  outflow  occurs.  Hence,  there  is  an  overspecification  of  vorticity  in  the 
limit  as  one  approaches  a  characteristic  point,  due  to  the  specification  of  vorticity  at 
the  inflow  in  tandem  with  the  conservation  of  vorticity  along  the  streamline.  The 
overspecification  of  vorticity  at  characteristic  points  can  give  rise  to  infinite  vorticity 
gradients,  Bennett  &  Kloeden  (1981).  This  rather  subtle  point  is  discussed  at  length 
in  chapter  eight  of  Bennett  (1992).  This  problem  may  be  circumvented  to  some  degree 
by  applying  a  smoothing  operator  along  the  boundaries,  as  was  done  in  the  numerical 
experiments  of  Robinson  &  Haidvogel  (1980).  The  ill-posedness  of  the  forward  model 
at  characteristic  points  on  the  boundary  is  not  an  objective  of  the  present  study.  In 
this  study  the  boundary  vorticity  is  smoothed  as  part  of  the  assimilation,  preventing 
the  formation  of  large  vorticity  gradients. 

With  the  aim  of  assimilating  SYNOP  East  data  into  an  open  ocean  model, 
there  are  two  choices  available  for  the  implementation  of  open  boundaries.  The  first 
is  to  specify  ahead  of  time,  the  boundary  values  of  dynamical  quantities  sufficient  to 
integrate  the  model.  Until  now  this  has  been  the  most  common  technique  used  in 
open  ocean  assimilation  studies  (see  Moore,  1991;  Robinson  et  al.,  1988).  However, 
in  real  situations  the  dynamical  fields  at  the  open  boundaries  are  generally  unknown. 
Modelers  have  therefore  usually  estimated  the  boundary  values  from  some  climato¬ 
logical  large-scale  dataset  such  as  Levitus  (1982),  or  taken  the  boundary  values  from 
a  basin-scale  model  in  which  the  local  domain  is  embedded.  With  these  approaches  a 
sponge  layer  may  be  included  around  the  boundaries  to  lessen  the  influence  of  errors 
due  to  the  imposed  boundary  values  (Cummins  &  Mysak,  1988,  Chapman,  1985). 

A  more  desirable  yet  more  difficult  approach  is  to  estimate  the  open  boundary 
conditions  as  part  of  the  assimilation  study.  The  dynamics  may  be  used  to  extrap¬ 
olate  information  from  the  observations  to  the  boundaries.  Zou  et  al.  (1995)  used 
a  sequential  estimation  scheme  to  estimate  open  boundary  conditions  from  interior 
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data  for  a  barotropic  model.  The  size  of  the  estimation  problem  was  reduced  by 
estimating  the  boundary  conditions  at  one  time  from  the  interior  data  only  at  that 
time.  Time-dependent  boundary  conditions  were  estimated  by  Seiler  (1993)  using  the 
adjoint  method  with  quasigeostrophic  dynamics.  Also  Bennett  &  McIntosh  (1982) 
used  a  variational  formulation  with  a  linearized  shallow— water  equation  model  to  es¬ 
timate  boundary  conditions  in  a  coastal  model.  These  previous  studies  focussed  on 
the  estimation  of  the  boundary  conditions  and  did  not  include  the  initial  conditions 
in  the  assimilation. 

The  major  difficulty  with  time-dependent  boundary  conditions  is  that  in  gen¬ 
eral  there  are  a  great  number  of  boundary  values  to  be  estimated,  from  relatively  few 
data  points.  To  address  this  difficulty,  terms  can  be  included  in  the  cost  function  to 
penalize  departures  of  the  first-order  statistics  of  the  boundary  conditions  during  the 
assimilation,  from  their  a  priori  values.  The  a  priori  first— order  statistics  can  be  the 
expected  variance  of  the  boundary  values  about  some  mean  value,  and  their  expected 
spatial  smoothness  (Thacker,  1988). 

The  second  major  scientific  objective  of  this  dissertation  is  thus  the  optimal 
estimation  of  all  of  the  initial  and  boundary  conditions  necessary  to  integrate  the 
open-ocean  model  forward  in  time,  in  order  to  fit  sparse  interior  data. 

It  will  be  shown  that  estimating  the  error  in  the  estimated  dynamical  fields  in 
the  presence  of  strong  nonlinearity  is  possible.  The  full  error  covariance  matrix  for 
the  control  variables  can  be  computed  from  the  inverse  of  the  Hessian  even  when  the 
model  dynamics  are  strongly  nonlinear.  This  is  a  major  new  result  as  no  previous 
assimilation  studies  using  the  adjoint  method  have  attempted  to  compute  full  error 
covariances  with  a  nonlinear  dynamical  model. 

The  second  major  new  result  is  the  successful  estimation  of  the  full  set  of 
necessary  boundary  conditions  specified  by  Charney  et  al.  (1950).  It  is  found  that  the 
interior  data  are  sensitive  to  the  boundary  vorticity  and  streamfunction,  via  advection 
in  the  prognostic  equation  for  vorticity.  The  interior  data  are  also  strongly  sensitive 
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to  the  boundary  streamfunction  due  to  the  elliptic  character  of  the  Poisson  equation 
for  streamfunction.  This  latter  effect  is  instantaneous.  From  a  time-invariant  first 
guess  for  the  boundary  conditions,  ie.  persistence,  the  adjoint  method  is  able  to  create 
temporal  variations  in  the  boundary  conditions  so  that  the  ensuing  interior  flow  field 
matches  the  signals  recorded  in  the  data. 

This  dissertation  is  divided  into  two  parts  each  of  which  uses  a  different  set  of 
control  variables,  that  is  independent  model  parameters  that  are  varied  in  order  to 
minimize  a  model/data  misfit.  In  the  first  part  of  this  study,  the  open— ocean  boundary 
conditions  are  treated  a.s  known  in  the  assimilation  experiments,  and  the  model  is 
constructed  in  such  a  manner  that  the  prescribed  boundary  conditions  have  minimal 
influence  on  the  assimilation  of  data  into  the  dynamical  model.  For  this  first  part, 
the  control  variables  are  the  initial  conditions  for  vorticity.  Chapter  one  describes  the 
set-up  of  the  simulated  ocean,  and  presents  several  twin  experiments  where  different 
prescriptions  of  the  boundary  conditions  are  used.  The  formalism  to  derive  the  error 
covariance  matrix  for  the  control  variables  from  the  inverse  Hessian  matrix  of  the 
cost  function  at  its  minimum  is  presented  in  chapter  two.  Error  estimates  for  the 
vorticity  initial  conditions  are  calculated  using  two  different  methods  that  illustrate 
the  effects  of  nonlinearity.  As  this  is  a  major  new  result,  the  calculation  of  the  error 
covariance  matrix  and  its  off-diagonal  structure  are  explored  in  detail.  The  sensitivity 
of  the  observed  velocities  in  the  estimated  flow  to  the  vorticity  initial  conditions  is 
also  shown. 

In  the  second  part  of  the  dissertation  the  boundary  conditions  for  streamfunc¬ 
tion  and  vorticity,  as  well  as  the  initial  conditions  for  vorticity,  are  used  as  control 
variables  in  the  adjoint  method.  The  number  of  control  variables  to  be  estimated 
is  reduced  by  expanding  the  temporal  behavior  of  each  boundary  value  of  stream- 
function  and  vorticity  in  Fourier  series.  This  expansion  is  presented  in  chapter  three, 
along  with  the  results  of  several  twin  experiments  illustrating  the  dependence  of  the 
assimilation  on  the  availability  of  a  priori  information  about  the  boundary  conditions. 
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Experiments  are  run  with  noise  added  to  the  observations,  and  to  the  first  guess  for 
the  control  variables.  Objective  mapping  of  a  simulated  streamfunction  survey  is 
used  to  construct  a  first  guess  for  the  control  variables.  These  latter  experiments 
test  the  ability  of  the  adjoint  method  to  reconstruct  the  flow  field  from  more  realistic 
knowledge. 

In  chapter  four  the  quality  of  the  estimated  initial  and  boundary  conditions 
is  assessed  by  examining  the  residuals  with  the  true  values,  and  again  by  calculating 
the  error  covariance  matrix  from  the  inverse  Hessian.  The  sensitivity  of  the  model 
counterparts  of  the  observations  to  perturbations  in  the  estimated  boundary  and 
initial  conditions  as  well  as  the  means  by  which  information  is  spread  in  time  and  space 
is  discussed.  Error  maps  are  then  estimated  for  the  interior  fields  of  streamfunction 
and  vorticity,  by  a  Jacobian  transformation  of  the  error  covariance  matrix  for  the 
boundary  and  initial  conditions. 

Details  of  the  formulation  of  the  adjoint  method  is  presented  in  the  appendix, 
starting  from  the  general  form  and  then  deriving  the  full  adjoint  model  for  the  finite- 
difference  forward  model.  For  the  mathematical  formulae  throughout  this  disserta¬ 
tion,  vectors  are  denoted  by  lowercase  boldface  characters  and  matrices  are  denoted 
by  uppercase  boldface  characters. 
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Chapter  1 


Estimating  initial  conditions 


In  this  chapter  the  adjoint  method  is  used  to  optimally  estimate  the  vorticity  initial 
conditions  for  a  quasi-geostirophic  model  that  minimize  the  misfit  between  model 
and  data  at  later  times.  The  simulated  experiments  presented  here  are  designed 
to  simulate  the  observing  array  from  the  SYNOP  East  project,  which  consisted  of 
an  array  of  current  meters  that  spanned  the  Gulf  Stream  and  most  of  the  southern 
and  northern  recirculations.  There  was  also  an  array  of  tomographic  transceivers  in 
the  southern  recirculation.  Results  from  these  observations  can  be  found  in  Chester 
(1993). 

The  difficulties  of  implementing  boundary  conditions  in  open  ocean  modeling 
were  discussed  in  the  introduction.  A  simple  first  approach  taken  in  this  chapter  is 
to  specify  the  the  values  of  streamfunction  and  vorticity  on  the  boundaries  before 
the  assimilation,  based  upon  any  available  information.  The  effect  of  the  assumed 
boundary  values  on  the  interior  flow  field  can  be  diminished  by  enlarging  the  width 
of  the  model  domain  in  order  that  the  edges  of  the  domain  are  far  from  from  the 
observation  sites.  This  is  an  impractical  fix  as  the  number  of  interior  grid  points 
increases  with  the  square  of  the  number  of  boundary  points.  For  a  jet-like  flow 
over  a  long  time  period  the  boundaries  would  have  to  be  a  great  distance  from  the 
observations,  so  the  number  of  model  variables  to  estimate  would  become  quite  large. 
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Also  the  model  domain  would  become  so  broad  that  most  of  the  interior  would  be 
unobservable  from  the  observations.  A  somewhat  better  fix  is  to  impose  a  sponge 
layer  around  the  edges  of  the  model  domain,  with  the  hope  that  this  will  diminish 
the  influence  of  the  assumed  boundary  conditions  on  the  model/data  misfit  in  the 
interior. 

Tziperman  &  Thacker  (1989)  used  the  adjoint  method  to  estimate  the  vor- 
ticity  initial  conditions  for  a  barotropic  quasi-geostrophic  model  of  the  subtropical 
gyre,  from  synthetic  observations  of  streamfunction  and  vorticity.  Their  forward  and 
adjoint  model  codes  were  used  as  a  basis  for  the  estimation  experiments  presented 
here.  The  model  codes  were  modified  to  include  periodic  boundary  conditions  and 
observations  of  velocity. 

The  simulation  of  the  Gulf  Stream  is  first  described,  then  follows  a  section 
on  the  cost  function  and  the  simulated  observations.  Details  on  the  adjoint  method, 
which  involves  quite  elaborate  mathematics,  can  be  found  in  the  appendix.  Twin 
experiments  are  then  presented,  and  their  results  show  the  effectiveness  or  otherwise 
of  the  assimilation  scheme. 


1.1  Modeling  the  Gulf  Stream 

The  model  used  here  to  represent  the  ocean  dynamics  is  the  nonlinear  barotropic 
conservation  of  vorticity  on  a  /3-plane  with  bottom  friction  and  biharmonic  friction 
(Pedlosky,  1979), 

+  J  (^,  0  + = -«(  -  aV-C 

where 

C  =  vV- 


18 


These  two  equations  will  be  referred  to  as  the  forward  model.  When  space  and  time 
are  discretized,  the  prognostic  equation  in  finite-difference  form  is 


mi  _ 

St, 7  St 


At 


^  +  J  i’l’bAi)  +  ^ 


2Ax 


+  ^b(!j  +  ,  —  0  (1.1) 


where  the  grid  spacing,  Ax  and  Ay,  is  20  km,  and  the  time  step.  At,  is  one  hour.  With 
this  choice  of  discretization  the  prognostic  equation  is  stable  in  time  using  the  Euler- 
forward  form  of  the  time  derivative.  The  standard  nine-point  Arakawa  Jacobian  and 
the  five-point  Laplacian  are  used  in  the  discretized  forms.  The  model  domain  is  a 
zonally  periodic  1,280  x  1,280  km  square.  Streamfunction,  and  vorticity,  at 
each  interior  grid  point  and  time-step  are  called  the  state  variables. 

Charney,  Fj0rtoft  and  von  Neumann  (1950)  investigated  the  open  boundary 
conditions  that  are  appropriate  to  integrate  this  same  model  forward  in  time.  They 
found  that  a  sufficient  set  of  boundary  conditions  are:  vorticity  at  all  interior  points  at 
the  initial  time,  streamfunction  at  all  boundary  points  at  all  time-steps  and  vorticity 
at  all  inflow  points  on  the  boundary  at  all  time-steps  (see  also  Bennett,  1992,  chapter 
8).  Denoted  the  CFvN  conditions,  they  hold  when  the  boundary  “is  not  an  interface 
between  one  system  and  another.  Instead,  the  boundary  merely  defines  a  sub-region 
of  a  single  system.”  (Bennett,  1992).  One  of  the  main  aims  of  this  study  is  the 
estimation  of  the  CFvN  boundary  conditions  for  a  sub-region  of  the  zonally  periodic 
channel. 

The  Gulf  Stream  is  modeled  by  a  thin  zonal  jet  that  meanders  and  goes 
barotropically  unstable  over  a  period  of  24  days.  The  streamfunction  and  vortic¬ 
ity  fields  are  shown  in  figure  1.1.  The  model  parameters  used  to  create  this  simulated 
flow  field  are  based  upon  the  numerical  experiments  of  Flierl  et  al.  (1987).  The  me¬ 
andering  is  made  more  pronounced  by  setting  /3  to  zero  for  the  experiments  in  this 
chapter  and  in  chapter  three. 

The  barotropic  vorticity  equation  is  clearly  an  oversimplification  of  the  physics 
of  the  Gulf  Stream.  However  it  does  enable  one  to  construct  a  flow  field  that  contains 
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Figure  1.1:  The  evolution  of  the  jet  over  24  days,  plotted  are  contours  of  (left)  stream- 
function  (C.L=5000  m^/s),  and  (right)  vorticity  (C.L=  2  X  10“®  5“^),  every  8  days  over 
the  zonally  periodic  domain  (1280  kmx  1280  km,  with  a  grid  spacing  of  20  km).  The 
sub-domain  boundaries  used  in  the  twin  experiments  are  drawn  on  the  streamfunction 


some  of  the  principal  features  of  the  Gulf  Stream  using  many  fewer  state  variables  than 
more  complicated  models.  The  nonlinearity  in  the  barotropic  vorticity  equation  is  the 
strong  advection  of  vorticity  by  the  velocity  field,  which  is  regarded  as  a  dominant 
dynamical  feature  of  the  Gulf  Stream.  The  feature  most  obviously  lacking  is  vertical 
variation.  The  barotropic  component  of  the  Gulf  Stream  increases  after  it  leaves  Cape 
Hatteras  due  to  the  northern  and  southern  recirculation  gyres  (Hogg,  1992).  Also, 
much  of  the  variability  in  the  region  of  the  southern  recirculation  gyre  is  contained 
in  the  barotropic  mode  and  the  first  and  second  baroclinic  modes  (Hua  et  ai,  1986, 
Richman  et  ai,  1976).  For  the  assimilation  of  real  instead  of  synthetic  data,  an 
adequate  model  for  the  Gulf  Stream  and  its  recirculations  could  be  constructed  using 
quasi-geostrophic  dynamics  in  the  barotropic  and  first  few  baroclinic  modes,  with  the 
same  assimilation  procedure  as  is  used  here.  The  inclusion  of  vertical  variability  is  a 
necessary  next  step  for  this  line  of  research. 

1.2  The  cost  function 

By  varying  a  vector  of  control  variables,  0,  the  assimilation  forces  the  forward  model 
to  create  a  flow  field  that  agrees  with  the  available  data.  The  control  variables  may 
be  any  independent  parameters  of  the  model.  The  available  data  is  organised  as  a 
vector,  d,  which  is  related  to  the  control  variables  by  the  sampling  equation, 

m(0)  +  n  =  d  (1.2) 

where  m(0)  is  the  vector  of  model  counterparts  of  the  data,  and  n  are  the  errors 
involved  in  representing  d  by  m(0).  One  can  also  think  of  m(0)  as  the  data  created 
by  the  control  variables,  or  the  control  vector  mapped  onto  the  data  space. 

The  differences  between  the  data  and  the  model  counterparts  of  the  data  should 
be  reasonably  small.  This  difference  is  expressed  in  a  least  squares  form  termed  the 
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(1.3) 


cost  function, 

j  _  1  v  (^fc(^)  -  dkf 

”  2  V 

where  dk  is  the  fcth  data  point,  mk{0)  is  the  model  counterpart  of  the  A:th  data  point, 
and  al  is  the  expected  variance  of  the  observational  error  in  the  fcth  data  point, 
nfc.  For  real  observations,  the  observational  error  is  comprised  of  both  the  difference 
between  the  data  and  their  true  values  due  to  instrument  noise  and  sampling  error, 
and  the  difference  between  the  data  and  their  model  counterparts  due  to  inadequacies 
in  the  model.  The  assimilation  searches  for  the  control  variables,  6,  that  minimize  J . 
Equation  (1.3)  can  be  rewritten  in  matrix/vector  form  as 


J  =  ^(m(0)  -  d)^R  -  d)  (1.4) 

where  R  =<  nn^  >  is  the  covariance  matrix  for  the  observational  errors  ^  ,  n.  It 
is  a  diagonal  matrix  with  the  down  the  main  diagonal.  The  vector  of  residuals, 
m(^)  —  d,  is  called  the  model/data  misfit. 

The  observations  to  be  used  as  data  for  the  assimilation  are  of  flow  velocity 


from  an  array  of  simulated  current  meters  at  various  points  in  the  model  domain. 
The  eastward  and  northward  components  of  flow  velocity  at  grid  point,  and 

time-step  t  are  determined  from  the  streamfunction  using  center-differences 


[dy],.-  2 Ay  ’  \dx),j~  2 Ax 


With  this  discretization,  streamfunction,  velocity  and  vorticity  all  occupy  the  same 
grid-points.  Also  there  are  observations  of  line-averaged  flow  velocity  between  two 
points,  to  simulate  reciprocal  travel-times  from  acoustic  tomography.  For  a  sound 
wave  traveling  between  two  acoustic  transceivers  in  a  fluid,  the  difference  in  travel 
time  going  in  opposite  directions  is  a  measure  of  the  integrated  along-path  fluid  speed 
(see  Chester,  1993,  for  details).  Four  simulated  transceivers  are  placed  at  the  points 


^The  delimiters  <  . . .  >  denote  the  expected  value  of  a  quantity  over  a  large  number  of  realizations 
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of  a  cross  shape  in  the  barotropic  model  domain.  The  mean  zonal  flow  speed  between 
the  west  transceiver,  and  the  east  transceiver,  E,  at  time  t,  is  given  by 


^WE  ~  ~~  r  I  “  T  ^ 

LJe  dy  L  -f-; 


jV+^ 


Ax 


dy  L  V  2Aj/ 

and  the  mean  meridional  flow  speed  between  the  south  transceiver,  5",  and  the  north 
transceiver,  JV,  at  time  t,  is  given  by 
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where  L=200  km  is  the  transceiver  separation. 

The  cost  function  is  defined  for  current  meter  observations  and  Nr  tomo¬ 
graphic  observations  as 
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where  cr„  =  4  x  10"^m/s  is  the  a  priori  estimate  of  the  error  in  the  velocity  observa¬ 
tions,  and  CTr  =  4  X  10”^m/s  is  the  a  priori  estimate  of  the  error  in  the  tomographic 
observations.  This  choice  of  a  priori  error  is  taken  from  a  rule  of  thumb  that  the 
error  in  a  velocity  observation  is  of  the  order  of  5%  of  the  total  variability  (Nelson 
Hogg,  personal  communication). 

In  this  chapter  the  control  variables  are  the  vorticity  initial  conditions.  The 
vorticity  initial  conditions  are  related  to  the  model  counterparts  of  the  data  (the 
spatial  derivatives  of  streamfunction  at  ik,jk,tk)  through  the  forward  model,  equation 
(1.1).  When  the  assimilation  is  performed  so  as  to  minimize  the  cost  function,  in 
theory  the  cost  function  could  be  reduced  to  zero  if  the  model  is  perfectly  consistent 
with  the  data,  and  the  data  are  noise-free.  However,  if  there  are  errors  in  the  data 
or  in  the  model,  the  aim  is  to  minize  the  cost  function  only  to  the  level  at  which  the 
residuals  in  (1.5)  are  of  the  same  order  as  the  observational  error.  If  the  errors  are 
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zero,  everything  is  perfect  and  the  final  value  of  J  will  be  zero.  The  minimization  is 
performed  using  the  adjoint  method,  details  of  which  are  given  in  the  appendix. 

As  well  as  the  model/data  misfit,  other  terms  may  be  included  in  the  cost 
function  that  penalize  the  mean-square  of  residuals.  Knowledge  of  the  first-order 
statistics  of  the  control  variables  can  be  used  in  the  cost  function  to  improve  the 
minimization.  That  is,  one  usually  has  an  a  priori  estimate  of  the  expected  value  or 
mean,  Oq,  and  the  covariance  of  this  expected  value  about  the  unknown  true  values 
of  the  control  variables,  Ree,  ie. 

R,,  =<  (0  -  0o){e  -  Oof  > 

Then  the  least-squares  term 

-  «o) 

is  added  to  the  cost  function,  J,  where  Ng  is  the  number  of  control  variables.  If 
the  mean  and  variance  of  the  control  variables  is  totally  unknown,  one  would  give 
the  control  variables  infinite  a  priori  variance  reflecting  the  fact  that  the  control 
variables  can  assume  any  value.  In  this  case  the  above  least-squares  term  is  zero.  In 
oceanography  one  usually  has  some  idea  of  the  first-order  statistics  of  the  dynamical 
variables  being  used  as  control  variables. 

For  each  experiment  in  this  study,  6q  is  set  as  the  first  guess  for  the  control 
variables,  and  Rgg  is  set  as  a  diagonal  matrix  with  the  expected  variances  of  each  con¬ 
trol  variable  down  the  main  diagonal.  For  the  experiments  presented  in  this  chapter 
the  expected  variance  of  the  vorticity  initial  conditions  is  determined  by  averaging 
over  the  true  flow  field.  A  mean  measure  of  variance  is  chosen  that  is  the  same  for 
all  of  the  vorticity  initial  conditions.  For  the  experiments  of  chapter  three,  where 
the  boundary  conditions  for  streamfunction  and  vorticity  are  included  in  the  control 
vector,  the  choice  of  Res  will  be  seen  to  require  more  careful  consideration. 
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Also  added  to  the  cost  function  is  a  mean-square  term  penalizing  the  Laplacian 


of  the  vorticity  initial  conditions  (the  control  variables  in  this  chapter),  ie. 


2Ne 


where  s  is  the  a  ■priori  expected  value  of  the  mean-square  Laplacian  of  the  vorticity 
field,  here  evaluated  from  the  true  flow  field.  The  effect  of  minimizing  this  term  is 
to  enforce  spatial  smoothness  (see  Thacker,  1988,  for  more  details).  Rewriting  this 
smoothness  term  in  matrix/vector  form,  it  can  be  added  to  the  previous  size  term,  to 
produce  an  a  priori  covariance  matrix  for  the  control  vector,  Pq,  defined  in  the  cost 
function  by 


{0  -  dofPo-\0  -  0o)  =  {0-  Oof  [R.fl7  +  L'^S-^l]  {0  -  0o) 

where  L  is  the  Laplacian  operator  in  matrix  form  and  S  is  a  diagonal  matrix  with 
the  value  of  s  along  the  diagonal. 

Thus  the  cost  function  consists  of  three  parts:  size,  smoothness  and  model/data 
misfit.  In  a  successful  minimization  these  should  all  be  reduced  to  order  one.  The 
rms  variances  selected  for  the  weights  in  the  size  and  smoothness  terms,  are  deter¬ 
mined  from  the  true  flow  field.  For  real  data,  one  could  put  bounds  on  the  size  and 
smoothness  of  streamfunction  and  potential  vorticity,  by  invoking  quasigeostrophy  or 
by  studying  output  from  a  general  circulation  model. 


1.3  Twin  experiments 

The  objective  is  now  to  reconstruct  the  true  ocean  state  from  sparse  observations  using 
some  assimilation  scheme.  For  the  purpose  of  assessing  the  success  of  the  assimilation 
scheme,  and  to  study  the  spread  of  information  from  the  observations  into  regions 
devoid  of  data,  the  twin  experiment  approach  is  used  here  (Ghil  &  Malanotte-Rizzoli, 
1991). 
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In  an  identical  twin  experiment,  a  control  model  run  provides  a  simulated  ocean 
state  as  a  true  reference  ocean,  from  which  simulated  observations  are  taken.  The 
model  is  then  started  from  some  false  initial  conditions  (that  are  significantly  different 
from  the  true  initial  conditions)  and  integrated  forward  in  time.  An  assimilation 
scheme  is  used  to  force  the  so-called  false  flow  field  towards  the  true  flow  field  using  the 
observations.  The  success  of  the  assimilation  scheme  in  rendering  the  false  flow  field  to 
be  more  like  the  true  flow  field  can  be  assessed.  Malanotte-Rizzoli  and  Young  (1992), 
used  a  primitive-equation  model  of  the  Gulf  Stream,  taking  simulated  observations 
from  array  configurations  similar  to  those  used  in  the  SYNOP  experiment.  Their 
sequential  estimation  scheme  was  able  to  force  the  model  from  a  false  time-dependent 
state  towards  the  true  state  from  which  the  observations  had  been  taken. 

In  this  study  so-called  fraternal  twin  experiments  are  performed,  as  distinct 
from  identical  twin  experiments.  In  identical  twin  experiments  the  model  used  to 
assimilate  data  from  the  true  flow  field,  has  the  same  model  parameters  as  the  model 
used  to  create  the  true  flow  field.  That  is,  domain  size,  friction,  etc.  are  kept  the 
same.  In  fraternal  twin  experiments  the  model  parameters  used  for  the  assimilation 
differ  in  some  way  from  the  those  of  the  true  flow  field.  In  the  following  experiments 
the  domain  size  of  the  model  used  for  the  assimilation  is  a  smaller  sub-domain  of  the 
true  flow  field. 

When  the  domain  size  of  the  assimilation  is  kept  as  large  as  the  domain  size 
of  the  true  flow  field  domain,  there  are  a  large  of  number  of  control  variables  (initial 
conditions  of  vorticity  at  every  interior  grid  point)  and  the  assimilation  runs  slowly 
and  requires  a  lot  of  memory.  Moreover,  for  an  observing  array  that  does  not  have 
great  spatial  coverage  relative  to  the  size  of  the  true  flow  field  domain,  much  of 
the  domain  can  not  be  well  estimated  from  the  data.  In  the  parlance  of  control 
theory,  those  parts  of  the  domain  where  the  vorticity  initial  conditions  are  not  well 
determined  from  the  data  are  said  to  be  unobservable.  The  model  domain  used  for  the 
assimilation  in  the  various  twin  experiments  to  be  presented  here,  is  constructed  so  as 
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to  enclose  the  observing  array  and  be  wide  enough  to  allow  the  spread  of  information 
from  the  observations. 

Starting  from  some  first  guess  for  the  vorticity  initial  conditions,  the  adjoint 
method  is  used  to  optimally  estimate  the  vorticity  initial  conditions  that,  when  in¬ 
tegrated  forward  in  time,  best  fit  the  observations  by  minimizing  the  cost  function. 
The  observations  from  the  true  flow  field  are  taken  only  from  the  observing  array  in 
the  interior  so  the  boundary  values  of  streamfunction  and  vorticity  are  not  known. 
The  approach  taken  in  this  chapter  is  to  specify  the  values  of  streamfunction  and 
vorticity  on  the  boundary  by  making  some  a  'priori  assumptions  about  the  expected 
flow  field  in  the  region.  Moore  (1991)  estimated  vorticity  initial  conditions  from  real 
data  using  the  adjoint  method  for  a  quasi-geostrophic  model  of  the  Gulf  Stream, 
keeping  the  open-ocean  boundary  conditions  fixed  at  values  taken  from  a  larger  scale 
model  combined  with  smoothed  data.  The  level  of  assumed  prior  knowledge  of  the 
boundary  conditions  for  the  experiments  presented  here  is  much  lower  than  in  Moore 
(1991). 

A  sponge  layer  is  introduced  over  several  grid  points  from  the  boundary  into 
the  interior.  In  the  sponge  layer  the  bottom  friction  coefficient,  ei,,  is  ramped  in  space 
such  that  it  goes  from  a  low  value  in  the  interior  to  a  high  value  at  the  boundary  over 
the  space  of  a  few  grid  points.  The  purpose  of  the  sponge  layer  is  to  decelerate  the 
flow  in  a  layer  near  the  boundaries.  It  is  hoped  that  this  will  have  the  same  effect  as 
removing  the  boundaries  of  the  sub-domain  to  a  greater  distance  from  the  observing 
array,  and  thus  reduce  the  influence  of  the  specified  boundary  values  on  the  interior 
flow  field  seen  by  the  observing  array. 

It  is  desirable  that  the  estimation  of  the  vorticity  initial  conditions  be  in¬ 
fluenced  as  much  as  possible  by  the  observations  and  less  by  the  specified  boundary 
conditions  for  streamfunction  and  vorticity.  From  the  forward  model  equations,  (1.1), 
one  can  see  that  there  are  three  processes  by  which  the  boundary  conditions  may  in- 
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fluence  the  interior  fields:  (i)  advection,  (ii)  wave  propagation,  and  (iii)  the  elliptic 
character  of  the  Poisson  equation  for  streamfunction. 

The  sponge  layer  reduces  the  magnitude  of  advection  in  the  vicinity  of  the 
boundary,  so  that  for  the  first  process  the  influence  of  the  boundary  values  of  stream- 
function  and  vorticity  on  the  observations  of  velocity  in  the  interior  is  reduced.  For  a 
short  enough  assimilation  period,  the  boundary  streamfunction  and  vorticity  will  not 
be  advected  through  the  observing  array,  but  obviously  when  the  model  is  integrated 
long  enough  all  of  the  interior  points  will  feel  the  effect  of  the  specified  boundary 
conditions  through  advection. 

For  the  experiments  presented  here,  beta  is  deliberately  set  to  zero  so  that  the 
meandering  of  the  jet  is  more  pronounced  (see  Flierl  et  al,  1987).  Thus  there  are  no 
planetary  waves  to  spread  information  through  the  domain.  Using  actual  data  from 
SYNOP  East  one  would  include  beta  in  the  model  dynamics.  Although  compared 
to  advection  in  the  vicinity  of  the  jet,  planetary  waves  would  probably  have  little 
importance  in  the  spread  of  information. 

For  the  third  process,  at  any  time-step  the  value  of  streamfunction  at  any 
interior  point  is  dependent  upon  the  local  value  of  vorticity  and  the  boundary  values 
of  streamfunction  (see  Morse  &  Feshbach,  p.  696,  1953).  Thus  perturbations  of  the 
boundary  values  of  streamfunction  instantaneously  affect  the  streamfunction  field 
throughout  the  interior.  This  process,  or  mechanism,  will  be  seen  to  be  important 
in  the  estimation  of  time-dependent  boundary  streamfunction  in  chapters  three  and 
four.  For  the  experiments  in  this  chapter  where  boundary  streamfunction  is  specified 
ahead  of  time,  and  velocity  observations  are  used,  this  third  process  does  not  have 
much  impact  on  the  estimation  of  the  optimal  vorticity  initial  conditions.  If  the 
available  data  were  interior  streamfunction  observations  the  boundary  streamfunction 
would  strongly  influence  the  model  counterparts  of  the  observations  via  this  process. 
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1.4  Results 


For  the  first  twin  experiment,  case  Al,  an  array  of  thirteen  current  meters  are  deployed 
that  span  the  central  part  of  the  jet  shown  in  figure  1.1.  This  array  type  was  chosen  as 
it  has  a  similar  configuration  to  the  central  and  eastern  arrays  in  the  SYNOP  project. 
The  model  domain  for  the  assimilation  covers  an  800  x  800  km  square  that  encloses 
the  current  meter  array.  The  data  consists  of  eastward  velocity,  n,  and  northward 
velocity,  u,  recorded  at  each  of  the  thirteen  current  meter  locations  every  two  days 
over  a  24  day  period. 

For  this  case  it  is  assumed  that  nothing  is  known  about  the  background  values 
of  streamfunction  and  vorticity  in  the  model  domain  and  outside  of  the  model  domain 
except  that  there  is  some  zonal  character  to  the  flow.  Thus  streamfunction  and 
vorticity  are  arbitrarily  set  to  zero  on  the  northern  and  southern  boundaries  for  all 
time,  and  are  made  periodic  on  the  eastern  and  western  boundaries.  The  first  guess  for 
the  vorticity  initial  conditions  is  zero  at  all  grid  points  at  the  start  of  the  assimilation. 

In  figure  1.2  snapshots  are  shown  of  the  streamfunction  for  the  true  flow  field 
shown  over  the  sub-domain  used  for  the  assimilation  (this  is  a  subplot  of  figure  LI), 
and  for  the  streamfunction  resulting  from  the  optimally  estimated  vorticity  initial 
conditions,  every  eight  days.  The  large  amplitude  signals  in  the  observations  are 
quite  well  estimated  in  the  false  flow  field.  The  location  of  the  jet  as  it  meanders,  is 
well  determined  in  the  vicinity  of  the  observing  array.  However  low  amplitude  signals 
in  the  observations  are  not  well  estimated.  The  flow  to  the  north  and  south  of  the 
jet  in  the  estimated  flow  field  does  not  resemble  the  flow  in  the  true  flow  field. 

The  assimilation  shows  some  skill  at  estimating  the  spatial  structure  of  the 
flow  field  downstream  of  the  observing  array  at  later  times.  This  is  because  the  flow 
that  passes  through  the  array  is  well  estimated,  and  in  this  highly  advective  flow,  the 
influence  is  carried  away  downstream.  In  the  regions  of  the  false  model  domain  that 
are  devoid  of  data,  the  estimates  of  streamfunction  and  vorticity  produced  by  the 
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Figure  1.2:  Streamfunction  every  two  days  for  (left)  the  true  flow  field  over  the  smaller 
domain,  (right)  Case  A1  :  the  estimated  flow  field,  the  dashed  line  marks  the  sponge  layer. 
(C.I.=5000  mVs) 


assimilation  are  strongly  influenced  by  the  specified  boundary  conditions.  The  pairs 
of  eddies  seen  in  the  true  flow  field  at  later  times  (bottom  frames  of  figure  1.1)  can 
not  be  well  estimated  in  the  false  flow  field  because  the  imposed  boundary  conditions 
of  the  sub— domain  are  inconsistent  with  their  corresponding  values  in  the  true  flow 
field. 

For  case  B1  an  array  of  five  current  meters  in  a  cross  shape  are  deployed  to 
the  north  of  the  jet.  They  have  the  same  locations  as  the  top  five  current  meters  in 
the  array  used  in  case  Al.  There  are  acoustic  tomography  transceivers  at  the  four 
outer  points  of  the  cross  .  With  this  observing  array  design,  the  current  meters  and 
the  tomography  can  be  readily  compared  within  the  context  of  the  assimilation.  The 
data  consists  of  u  and  v  recorded  at  the  five  current  meter  locations  every  two  days 
over  a  24  day  period,  and  twe  and  tsn  at  the  same  times.  The  true  flow  field  over 
the  sub-domain  used  for  this  case  is  shown  on  the  left  of  figure  1.3.  Being  situated 
to  the  north  of  the  jet,  the  array  does  not  observe  the  jet  until  it  swings  through  as 
a  meander  from  days  16  to  24. 

The  sub-domain  for  this  case  is  25  by  25  grid  points,  covering  an  area  of  480 
X  480  km.  As  in  case  Al,  in  the  absence  of  any  other  knowledge,  save  that  there 
is  some  zonal  character  to  the  flow,  the  boundary  conditions  for  streamfunction  and 
vorticity  are  set  to  zero  on  the  northern  and  southern  edges  of  the  model  domain, 
and  are  made  periodic  in  the  east-west  direction,  for  all  time-steps.  The  first  guess 
for  the  vorticity  initial  conditions  at  the  start  of  the  assimilation  is  zero. 

This  observing  array  configuration  and  sub-domain  size  are  used  for  the  re¬ 
mainder  of  the  twin  experiments  in  this  study.  The  choice  of  a  sub-domain  that  does 
not  include  the  jet  was  motivated  by  the  intention  of  eventually  calculating  eddy 
potential  vorticity  flux  divergences  in  the  southern  recirculation  gyre.  The  real  data 
that  would  be  used  in  an  assimilation  to  calculate  the  potential  vorticity  fields,  would 
come  from  the  lower  part  of  the  SYNOP  East  array  that  is  situated  in  the  recircula- 
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tion.  Also,  the  acoustic  tomographic  transceivers  are  situated  only  in  the  lower  part 
of  the  SYNOP  East  array  (Chester,  1993). 

In  figure  1.3  snapshots  are  shown  of  the  streamfunction  for  the  true  flow  field 
and  for  the  optimally  estimated  false  flow  field  at  eight  day  intervals.  It  is  apparent 
that  the  flow  field  is  not  estimated  well  here.  Setting  streamfunction  and  vorticity 
to  zero  on  the  southern  boundary  for  all  time  is  inconsistent  with  the  location  of  the 
jet  in  the  true  flow  field.  The  assimilation  puts  an  eddy  in  the  sponge  layer  to  the 
southwest  that  evolves  in  time  to  become  the  meander  event  that  is  observed  at  later 
times. 

The  true  and  estimated  vorticity  fields  for  case  B1  are  shown  in  figure  1.4.  The 
lack  of  resemblance  between  the  true  and  estimated  flow  fields  is  more  obvious  here, 
especially  at  the  initial  time.  By  its  dynamical  nature,  vorticity  has  a  shorter  length- 
scale  of  variability  than  streamfunction.  Thus  more  convoluted  spatial  structure  is 
allowed  in  the  estimated  vorticity  fields.  The  sponge  layer  can  be  seen  to  be  damping 
the  vorticity  near  the  boundaries. 

In  figure  1.5  the  velocity  time  series  are  displayed  from  each  of  the  current 
meters  and  from  the  tomography,  for  the  true  flow  field  and  the  assimilated  flow  field 
for  case  Bl.  One  can  see  that  the  tomographic  measurement  is  indeed  an  average 
over  the  current  meters  in  its  path.  The  strongest  signal  in  the  data  seen  in  figure  1.5 
is  the  meander  event  between  day  16  and  day  24.  This  is  the  best  estimated  feature 
of  the  true  flow  field.  The  low  amplitude  flow  field  observed  at  earlier  times  is  not 
well  estimated  from  the  data. 

As  can  be  seen  in  figure  1.3,  the  estimated  initial  conditions  for  streamfunction 
look  quite  different  from  the  true  conditions.  The  uniform  westward  flow  over  the 
observing  array  in  the  true  flow  field  at  the  initial  time  is  not  well  constructed  in  the 
estimated  flow  field.  Apart  from  the  imposed  boundary  conditions,  the  estimated  flow 
does  not  resemble  the  true  flow  at  the  initial  time  because  there  are  no  observations 
at  the  initial  time  (the  first  observations  are  at  day  1). 
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Figure  1.3:  Streamfunction  every  two  days  for  (left)  the  true  flow  field  over  the  smaller 
domain,  (right)  Case  Bl:  the  estimated  flow  field,  the  dashed  line  marks  the  sponge  layer. 
(C.I.=5000  mVs) 


33 


Figure  1.4:  Vorticity  every  two  days  for  (left)  the  true  flow  field  over  the  smaller  domain, 
(right)  Case  Bl:  the  estimated  flow  field,  the  dashed  line  marks  the  sponge  layer.  (C.L= 
1  X  10"®  s-^) 
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North  Vel.  CM:  1-3-5 
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Figure  1.5:  Time  series  of  velocities  observed  by  current  meters  (left),  and  tomography 
(right).  True  velocities  are  solid,  estimated  velocities  from  case  B1  are  dashed.  The  a  priori 
observational  error  is  4  cmfs. 


Note  that  the  velocity  residuals  for  the  optimally  estimated  flow  field  are  some¬ 
what  larger  than  the  a  priori  error  for  the  velocity  observations,  that  is,  the  difference 
between  the  solid  and  dashed  curves  in  figure  1.5  is  generally  greater  than  0.04  m/sec. 
Because  the  residuals  cannot  be  reduced  to  the  level  of  the  observational  error,  the 
model  used  for  this  case  is  assessed  as  being  inconsistent  with  the  observations  from 
the  true  flow  field. 

Setting  streamfunction  and  vorticity  to  zero  at  all  times  on  the  northern  and 
southern  boundaries  is  not  a  good  choice  if  it  is  known  that  there  is  a  meandering  jet 
somewhere  to  the  south.  A  more  informed  choice  for  the  boundary  values  could  be 
made.  Case  Cl  is  the  same  as  case  B1  but  with  streamfunction  and  vorticity  on  the 
southern  boundary  set  to  a  priori  estimates  of  their  mean  values  from  the  true  model. 
This  is,  in  effect,  a  statement  that  there  is  a  zonal  character  to  the  flow,  and  that 
there  is  a  net  eastward  transport  between  the  northern  and  southern  boundaries. 
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The  true  and  estimated  streamfunction  fields  are  shown  in  figure  1.6.  It  is 
apparent  that  the  assimilation  does  not  do  much  better  with  this  model  configuration. 
The  meander  event  is  again  estimated  quite  well,  but  the  fiow  field  at  earlier  times 
is  not  estimated  well  in  regions  far  from  the  array.  A  net  eastward  transport  is 
inconsistent  with  the  broad  westward  flow  seen  in  the  true  flow  field  at  early  times. 
Also  setting  streamfunction  to  any  sort  of  constant  on  the  southern  boundary  of  the 
sub-domain  is  not  a  good  choice  because  in  the  true  flow  field  the  flow  crosses  this 
line. 

Case  D1  is  a  check  that  the  assimilation  scheme  performs  satisfactorily  when 
the  model  is  made  perfectly  consistent  with  the  data.  The  same  false  model  configura¬ 
tion  is  used  as  in  cases  B1  and  Cl,  except  that  the  boundary  values  of  streamfunction 
and  vorticity  on  all  four  edges  are  taken  directly  from  the  true  flow  field,  keeping  all 
of  their  spatial  and  temporal  variation.  Not  surprisingly  this  assimilation  does  best 
of  all,  as  can  be  seen  in  figure  1.7.  Note  that  for  this  case  the  CFvN  boundary  con¬ 
ditions  are  overspecified  in  that  vorticity  is  prescribed  at  outflow  boundary  points  in 
the  forward  model  integration,  so  that  formally  the  forward  model  is  ill-posed.  This 
is  not  a  problem  as  the  prescribed  vorticity  on  the  boundary  is  consistent  with  the 
vorticity  being  advected  to  the  boundary  from  the  interior. 

The  value  of  the  cost  function  as  the  iterations  of  the  forward  and  adjoint  mod¬ 
els  proceed  (see  appendix)  is  shown  in  figure  1.8.  The  differences  among  the  specified 
boundary  conditions  for  the  twin  experiments  is  summarized  in  table  1.1.  One  can 
see  convergence  to  the  minimum  for  each  of  the  cases.  When  further  iterations  do  not 
reduce  the  value  of  the  cost  function  significantly,  the  minimization  process  stops.  A 
successful  minimization  is  one  for  which  the  model/data  misfit  is  reduced  to  the  order 
of  the  observational  error,  at  which  point  the  cost  function  is  of  order  one.  Only  in 
cases  A1  and  D1  is  the  cost  function  successfully  minimized.  Case  A1  does  better 
than  cases  B1  or  Cl  because  its  observing  array  covers  the  jet  at  all  observing  times. 
The  estimated  vorticity  initial  conditions  for  case  A1  can  well  estimate  the  flow  at 


36 


Figure  1.6:  Same  as  figure  1.3  but  for  case  Cl  where  streamfunction  and  vorticity  on  the 
northern  and  southern  boundaries  are  kept  at  their  mean  values  from  the  true  flow  field. 


Figure  L8:  Value  of  cost  function  vs.  iteration  for  each  case:  A1  (solid),  B1  (dash), 
Cl  (dot),  D1  (dotdash). 

early  times,  whereas  in  cases  B1  and  Cl  the  vorticity  initial  conditions  are  varied  so 
as  to  reconstruct  the  meander  event  at  later  times. 

Cases  Bl,  Cl  and  Dl,  differ  only  in  the  choice  of  boundary  conditions;  the  cost 
function,  the  data  and  the  dynamics  stay  the  same.  It  is  apparent  from  the  results 
of  these  experiments  that  the  model/data  misfit  is  strongly  sensitive  to  the  specified 
boundary  conditions.  The  inclusion  of  a  sponge  layer  in  cases  Al,  Bl  and  Cl  does 


Experiment 

Specified  Boundary 

Conditions 

Al 

zero  at  NS,  periodic  at  EW 

Bl 

zero  at  NS,  periodic  at  EW 

Cl 

mean  values  from  true  field  at  NS,  periodic  at  EW 

Dl 

true  values 

Table  1.1:  The  specified  boundary  conditions  for  streamfunction  and  vorticity  for  each 
twin  experiment. 
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not  adequately  diminish  the  influence  of  the  specified  boundary  conditions  on  the 
model/data  misfit.  One  could  attempt  to  specify  time-dependent  boundary  condi¬ 
tions  that  are  more  similar  to  the  true  boundary  conditions.  However  the  intention 
here  is  to  estimate  streamfunction  and  vorticity  fields  from  available  data  only,  using 
the  dynamics  to  spread  the  influence  of  the  data  in  time  and  space. 

In  the  true  flow  field,  the  fluid  in  the  jet  that  is  observed  at  day  16,  with 
an  average  speed  of  50  cm/sec,  was  700  kilometers  (35  grid  points)  away  to  the 
west  at  day  0.  The  480  km  x  480  km  sub-domain  is  too  small  to  allow  such  large 
distances  of  fluid  travel,  over  the  duration  of  the  assimilation.  The  results  show  that 
a  strong  signal  such  as  the  meander  observed  by  the  array  at  later  times  is  fairly 
well  estimated,  but  the  meander  is  forced  by  the  specified  boundary  conditions  to  be 
contained  within  the  sub-domain  at  earlier  times.  The  presence  of  the  meander  in 
the  sub-domain  at  early  times  is  inconsistent  with  the  observed  velocities  at  early 
times,  thus  deteriorating  the  model/data  misfit. 

Thus  it  is  not  fruitful  to  try  to  spread  information  in  space  and  time  from 
the  observations,  when  the  specified  boundary  conditions  are  having  such  an  adverse 
effect.  The  aim  is  to  construct  a  flow  field  from  the  observations  that  is  large  enough 
to  capture  the  energetic  scales  of  motion,  while  being  adequately  controlled  by  the 
data.  Eventually  much  longer  assimilation  time  periods  than  are  used  here  will  be 
needed  to  study  the  real  flow  field.  It  is  impractical  to  lengthen  the  time  periods 
used  in  cases  Al,  B1  and  Cl  as  already  the  the  specified  boundary  conditions  are 
adversely  affecting  the  model/data  misfit  for  only  a  24  day  assimilation.  For  longer 
periods,  where  there  are  many  energetic  features  in  the  flow,  it  is  necessary  to  find  a 
better  way  of  treating  the  open  boundary  conditions. 
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Chapter  2 


Uncertainty  of  the  fit  to  data 


After  a  model  has  been  fit  to  data  it  is  desirable  to  assess  the  quality  of  the  fit,  that 
is  to  make  some  estimate  of  the  error  so  as  to  know  what  level  of  confidence  to  have 
in  the  estimates.  For  what  follows  it  is  assumed  that  one  is  at  the  stage  of  having 
found  the  minimum  of  the  cost  function  with  respect  to  a  vector  of  control  variables, 
and  that  one  is  in  possession  of  the  gradient  of  the  cost  function  with  respect  to  the 
control  variables  at  the  minimum. 

For  a  linear  model  the  error  covariance  matrix  of  the  estimated  control  variables 
is  given  by  the  inverse  of  the  Hessian  matrix  of  the  cost  function  with  respect  to  the 
control  variables.  For  a  nonlinear  model,  the  errors  are  also  given  by  the  inverse 
Hessian  as  long  as  the  cost  function  approximates  a  quadratic  at  the  minimum.  A 
separate  problem  associated  with  nonlinearity  is  that  the  minimum  might  not  be  a 
global  minimum. 

Thacker  (1989)  illustrated  the  equivalence  between  the  inverse  Hessian  and 
the  error  covariance  of  the  control  variables,  by  placing  the  minimization  of  the  cost 
function  in  the  probabalistic  context  of  maximum  likelihood  theory.  Here  the  equiv¬ 
alence  is  more  clearly  shown  by  looking  at  the  minimization  of  the  cost  function  in 
terms  of  the  Gauss— Markov  theorem. 
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The  importance  of  the  Hessian  in  the  minimization  of  the  cost  function  is 
then  presented.  Two  different  methods  of  computing  the  Hessian  are  presented,  and 
some  error  covariance  maps  for  the  estimated  vorticity  initial  conditions  are  shown. 
Measures  of  the  sensitivity  of  the  model  counterparts  of  the  data  to  perturbations  in 
the  estimated  vorticity  initial  conditions  are  then  shown. 

In  Thacker  (1989)  the  evaluation  of  the  error  covariance  matrix  from  the  inverse 
Hessian  was  presented  for  weakly  nonlinear  dynamics.  Furthermore,  the  large  size  of 
the  Hessian  for  most  oceanographic  problem  has  deterred  people  from  calculating  the 
full  error  covariance  matrix  from  the  inverse  Hessian,  and  previous  studies  have  only 
calculated  the  main  diagonal  elements  of  the  error  covariance  matrix  using  finite- 
differences  (Schiller,  1995;  Tziperman  &  Thacker,  1989).  Marotzke  &  Wunsch  (1993) 
discuss  the  estimation  of  uncertainties  from  main  diagonal  elements  of  the  Hessian 
using  finite— differences  for  a  nonlinear  ocean  model,  and  point  to  the  desirability  of 
knowledge  of  the  covariances  between  control  variables.  Here  for  the  first  time  the 
estimation  of  the  full  error  covariance  matrix  from  the  inverse  Hessian  when  the  model 
dynamics  are  strongly  nonlinear  is  investigated. 


2.1  The  Gauss-Mar kov  approach 

The  minimization  of  the  least-squares  cost  function,  described  in  section  1.2,  may  be 
investigated  using  the  Gauss— Markov  theorem  (Liebelt,  1967). 

The  Gauss-Markov  approach  is  as  follows;  To  make  an  estimate  of  the  control 
vector,  denoted  9,  that  is  as  close  as  possible  to  the  unknown  true  control  vector,  6, 
the  6  is  sought  that  minimizes  the  variance, 

p  =<{6  -e){e  -of  >,  (2.1) 

given  observations,  d,  that  are  related  to  the  control  variables  by  the  sampling  equa¬ 
tion  (1.2). 
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To  make  the  distinction  between  Pq  and  P  clear, 

6  is  the  unknown  true  values  of  the  control  variables 
Oq  is  an  a  priori  guess  of  their  expected  value 
0  is  an  estimate  of  the  true  values  using  the  data  d 
Po  is  an  a  priori  guess  of  the  dispersion  of  0  about  Oq 
P  is  the  dispersion  of  the  estimates  9  about  the  true  values  6 
An  explicit  expression  for  the  9  that  minimize  P  is  obtained  by  substituting 
the  form 

9  ^9o  +  B{d-  m(0o)) 

into  (2.1),  and  solving  for  the  unknown  matrix  operator,  B.  Using  the  sampling 
equation  (1.2)  and  assuming  that  the  data  noise,  n,  is  uncorrelated  with  the  control 
variables,  0,  ie.  that  <  n9  >=  0,  we  get 

P  =<  (B[m(6>)-m(6lo)  +  n]-[6>-0o])(B[m(6>)-m(6lo)  +  n]-[6>-0o]f  >  •  (2.2) 


Assuming  that  the  a  priori  guess  for  the  control  variables,  ^o,  is  sufficiently  close  to 
the  true  control  variables,  0,  such  that  m(0)  is  linear  in  the  neighbourhood  of  m(0), 
then 


m(6>)  -  m(6lo)  =  ~[e  -  Oq] 


and  (2.2)  becomes 

P  =  B 


+R 


de 


de  ) 


aO 


^de) 


B^  +  Po 


which  is  a  quadratic  matrix  equation  in  terms  of  B  that  can  be  written  as  ^  , 


G 


(2.3) 


^For  conformable  matrices  X,  P  and  Q, 

XPX^  -  QX'^  -  XQ'^  ={X-  QP-^)P{X  -  QP-^Y  -  QP-^Q'^. 
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where  G  =  “^Po  right-hand  side  of  (2.3)  is 

positive-definite  since  Pq  and  R  and  therefore  G  are  positive-definite.  Hence  P  can 
be  minimized  with  respect  to  B  when 


B  =  pj?^]  G-‘ 


Po 


'dm' 


5m 


Hence  from  (2.3) 

P 


/5m\^ 

""{Wj 


° '  55  j 


n  -1 


de 


+  R 


5m 


Po 


“  I,  a«  7  ee 


(2.4) 


using  the  matrix  inversion  lemma  ^  . 

The  cost  function  as  defined  in  section  1.2  can  be  written  as 

J  =  i(«  -  eofp^'ie  -  e„)  +  ~(m(«)  -  d)TR-i(m(9)  -  d)  (2.5) 

then  its  Hessian  is  given  by 

,_i  .  fdmY^_^dm  _  d^m^_. 


H  =  Po  ^  + 


55  ) 


R 


+  ^R-\mie)  -  d). 


(2.6) 


55  55" 

Hence  the  error  covariance  matrix  of  the  control  variables  defined  by  equation  (2,4) 
is  given  by  the  inverse  of  the  Hessian  as  long  as  the  nonlinear  term,  the  third  term 
on  the  right  hand  side  of  equation  (2.6),  is  negligible,  that  is 


P  = 


.d0\ 


The  equivalence  between  the  inverse  Hessian  and  the  estimated  error  covari¬ 
ance  of  the  control  variables,  depends  upon  the  degree  of  linearity  in  the  model.  The 


^  ^A-\-C  ^)  ^  =  C'-CA'^  (^ACA'^  +  B)  ^  AC  where  the  inverses  exist,  see  Liebelt  (1967) 

for  details. 
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estimated  error  covariance  of  the  estimated  control  variables  is  well  approximated 
by  the  inverse  Hessian,  as  long  as  rn(0)  is  linear  in  the  vicinity  of  the  minimum  of 
the  cost  function,  J.  Hopefully  a  nonlinear  model  may  be  regarded  as  linear  in  a 
sufficiently  small  neighborhood  of  the  minimum  of  the  cost  function. 

2.2  Scaling  of  the  control  variables 

The  Hessian  matrix  of  second  derivatives  expresses  the  curvature  of  the  cost  function 
in  the  vector  space  spanned  by  the  control  variables.  A  high  curvature  in  a  certain 
direction  means  that  the  cost  is  strongly  sensitive  to  variations  in  the  linear  com¬ 
bination  of  control  variables  that  lie  in  that  direction.  For  example,  for  the  cases 
presented  in  the  last  chapter  a  velocity  observation  at  day  1  is  strongly  sensitive  to 
the  vorticity  initial  conditions  at  grid  points  near  the  current  meter  location. 

Conversely,  control  variables  that  have  little  or  no  influence  on  the  cost  corre¬ 
spond  to  directions  in  control-space  with  very  flat  curvature.  These  produce  ‘Valleys” 
in  the  cost  function  that  can  slow  down  the  gradient-search  minimization.  The  eigen¬ 
values  of  the  Hessian  matrix  associated  with  these  flat  directions  are  very  small  which 
can  render  the  Hessian  ill-conditioned  and  difficult  to  invert.  The  presence  of  terms 
in  the  cost  function  expressing  the  size  and  smoothness  of  the  control  variables,  pro¬ 
vides  some  curvature  in  all  directions  of  control-space,  thus  rendering  the  Hessian 
more  well-conditioned. 

How  the  cost  function  is  minimized  with  respect  to  the  control  variables  using 
the  adjoint  method,  is  described  in  detail  in  the  appendix.  The  limited-memory 
quasi-Newton  method  that  is  used  as  a  minimization  algorithm  here,  constructs  an 
approximation  to  the  Hessian  matrix  of  the  cost  function  at  each  iteration.  The 
minimization  will  converge  to  a  minimum  in  fewer  iterations  if  the  Hessian  is  well- 
conditioned.  For  more  information  on  the  above  points,  and  the  importance  of  the 
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Hessian  matrix  in  minimization  problems,  the  reader  is  referred  to  Thacker,  (1989), 
and  Gill  et  ai,  (1981). 

By  scaling  the  control  vaxiables  the  Hessian  can  be  rendered  well-conditioned. 
The  ideal  scaling  would  be  that  which  transforms  the  Hessian  matrix  into  the  identity 
matrix.  If  the  form  of  the  Hessian  were  known  beforehand,  one  could  perform  the 
following  scaling, 

=  <  {d  -  e){e  -  6)“^  > 

H?H-'(H?)^  =  > 

I  =  <  (z  -  z)(z  -  z)^  > 

where  z  =  H2  0  and  z  =  This  scaling  can  rarely  be  done  as  the  form  of  the 

Hessian  is  usually  not  known  before  the  minimization. 

From  the  expression  for  the  Hessian  given  by  equation  (2.6),  one  can  see  that 
Pg^  is  an  a  priori  estimate  for  the  Hessian.  In  section  1.2,  Pg^  was  defined  as  the 
sum  of  the  weighting  matrices  in  the  cost  function  for  size  and  smoothness  of  the 
control  vector.  One  would  like  the  scaling  to  be  a  vector,  so  a  practical  choice  for 
the  scaling  is  for  z  =  Rg/0.  For  the  experiments  of  chapter  one,  Rgg  was  a  diagonal 
matrix  with  the  expected  variance  of  the  vorticity  initial  conditions,  <  icm  > 
the  main  diagonal.  Hence  the  size  of  the  scaled  vorticity  initial  conditions  were  all 
of  order  one.  For  the  experiments  of  chapter  three,  we  shall  see  how  important  the 
choice  of  scaling  is  for  the  success  of  the  assimilation. 

2.3  Computing  the  error  covariance  matrix 

For  many  optimization  problems  in  oceanography,  computing  the  Hessian  is  pro¬ 
hibitive  because  of  its  large  dimension  (e.g.,  Marotzke  &  Wunsch,  1993;  Schiller, 
1995).  The  Hessian  has  NxN  elements  where  N  is  the  number  of  control  variables, 
and  can  often  be  too  large  for  inversion  using  available  computing  resources.  This  is 
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not  a  problem  for  the  cases  presented  here  where  there  are  0(1000)  control  variables. 
One  can  readily  invert  the  Hessian,  as  long  as  it  is  well-conditioned. 

In  this  section  are  presented  two  methods  of  computing  the  error  covariance 
matrix  given  by  the  inverse  Hessian  of  the  cost  function  (2.5).  These  methods  are 
applicable  when  the  adjoint  method  has  been  used  to  find  the  minimum  of  the  cost 
function. 


2.3.1  Finite  differences 


At  the  end  of  the  minimization  using  the  adjoint  method,  one  has  the  optimally 
estimated  control  variables,  denoted  0,  and  the  gradient  of  the  cost  function  with 
respect  to  the  control  variables  at  the  optimal  estimate,  |^(^)-  In  a  perfect  world  this 
latter  term  goes  to  zero  in  the  course  of  the  minimization.  In  practice  the  minimum 
is  found  by  iteration,  and  the  minimization  stops  when  the  norm  of  the  gradient  falls 
below  some  small  value.  The  small  finite  values  left  in  the  gradient  vector  at  the  end 
of  the  minimization  correspond  to  changes  in  the  control  variables  that  cause  very 
minor  differences  to  the  estimated  flow  field  (Wang  et  al.^  1992). 

To  calculate  the  Hessian  matrix  of  second  derivatives,  each  control  variable  is 
perturbed  by  a  small  amount,  then  the  gradient  of  the  cost  function  is  calculated  us¬ 
ing  the  adjoint  model  at  the  perturbed  point  in  N-space.  The  unperturbed  gradient 
is  subtracted  from  the  perturbed  gradient,  and  this  difference  is  divided  by  the  mag¬ 
nitude  of  the  perturbation  to  give  a  finite-difference  approximation  to  the  Hessian, 


le. 


H 


1 

~Ke 


The  ijih.  element  of  H  is 


3  L 


This  requires  the  gradient  to  be  calculated  as  many  times  as  there  are  model  param¬ 
eters,  but  one  can  use  the  existing  adjoint  model  code. 
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Caution  must  be  exercised  in  determining  the  correct  size  of  the  perturbation  so 
as  to  minimize  truncation  error  and  machine  error.  These  two  errors  are  proportional 
to  the  size  of  the  perturbation  in  opposing  ways:  (i)  If  AO  is  too  large  the  finite- 
difference  approximation  is  poor  due  to  the  truncation  of  the  Taylor  series,  and  the 
error  in  the  approximation  is  proportional  to  0{A9)',  (ii)  If  AO  is  too  small  the 
subtraction  on  the  right-hand  side  of  the  above  equation  will  be  between  terms  that 
are  the  same  to  a  high  number  of  significant  figures,  and  because  the  computer  has  a 
limited  number  of  significant  figures  available  (16  for  the  double  precision  FORTRAN 
used  here),  accuracy  is  lost  and  the  error  is  proportional  to  0(AO  ^).  See  Gill  et  al., 
1981,  for  further  details. 

A  variety  of  perturbation  sizes  were  employed  to  find  the  optimal  step-length, 
AO,  for  which  the  computed  second  derivatives  were  stable. 

The  aim  is  to  construct  the  error  covariance  matrix  given  by  equation  (2.4). 
Thus  a  disadvantage  of  this  method  is  that  the  computed  Hessian  will  contain  the 
nonlinear  term,  that  is  the  third  term  on  the  right-hand  side  of  equation  (2.6).  If  the 
size  of  this  nonlinear  contribution  is  significant  the  inverse  of  the  computed  Hessian 
will  not  be  a  good  approximation  to  the  error  covariance.  The  first  two  terms  in 
(2.6)  are  positive  definite  and  hence  well— conditioned;  should  the  third  term,  which 
is  indefinite,  be  significant  it  could  render  H  ill-conditioned  and  not  invertible 

2.3.2  Sensitivity  matrix 

This  method  of  computing  the  Hessian  is  taken  from  Thacker  (1989).  The  nonlinear 
term  in  the  expression  for  the  Hessian  (2.6)  is  ignored  in  this  calculation,  as  without 
it  the  inverse  Hessian  is  equivalent  to  the  error  covariance  in  the  estimated  control 
variables.  All  that  is  then  needed  to  compute  the  Hessian  is  the  calculation  of  the 
so-called  sensitivity  matrix 

For  M  data  points  and  N  control  variables,  m  and  d  are  vectors  of  dimension 
M,  0  has  dimension  N,  and  the  sensitivity  matrix  has  dimension  MxN.  Each  row  of 
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the  sensitivity  matrix  is  a  measure  of  the  sensitivity  of  a  particular  model  counterpart 
of  a  datum  to  variations  in  all  of  the  control  variables.  The  ijth  entry  is  the  gradient 
of  the  model  counterpart  of  the  jih  datum  with  respect  to  the  zth  control  variable. 
In  section  2.5  the  structure  of  the  sensitivity  matrix  is  explored  in  more  detail.  The 
sensitivity  matrix  is  readily  calculated  using  the  forward  and  adjoint  models,  by  the 
following  procedure: 

(i)  The  cost  function  J  in  the  adjoint  model  code  is  replaced  by  the  model 
counterpart  of  the  jth  datum,  mj(6). 

(ii)  The  adjoint  model  is  run  using  the  optimal  control  variables  6  previously 
estimated. 

(iii)  The  Lagrange  multipliers  then  give  the  gradient  of  rrij  with  respect  to 
which  is  the  jih  row  of  the  sensitivity  matrix. 

(i)  to  (iii)  are  performed  over  all  M  data  to  compute  the  sensitivity  matrix  row 
by  row.  The  Hessian  is  then  calculated  by  equation  (2.6)  without  the  nonlinear  term. 

The  existing  adjoint  model  code,  that  was  used  to  find  the  gradient  of  the  cost 
function  with  respect  to  the  vorticity  initial  conditions,  needs  to  be  slightly  modified 
to  accommodate  the  different  forcing  function  in  the  adjoint  equations  due  to  the  cost 
function  being  defined  differently.  With  this  method  the  adjoint  model  only  needs  to 
be  run  M  times.  This  is  an  immediate  advantage  over  the  finite-difference  method  if 
there  are  more  control  variables  than  data  (M<N),  as  is  often  the  case. 

Neglecting  the  nonlinear  term  in  (2.6)  has  two  benefits.  The  first  is  that 
the  inverse  Hessian  will  be  equivalent  to  the  control  variable  error  covariance  as 
expressed  in  equation  (2.4),  which  is  the  desired  quantity.  The  second  is  that  the  first 
two  matrices  on  the  right-hand  side  of  (2.6)  are  positive-definite  whereas  the  third 
nonlinear  term  is  not,  so  the  inverse  of  H  will  be  well-conditioned. 

If  the  nonlinearities  in  the  model  are  weak,  both  of  the  methods  described 
here  should  produce  the  same  estimates  of  the  Hessian.  However  the  finite-difference 
method  is  more  prone  to  errors  arising  from  the  size  of  the  step-length,  which  could 
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deteriorate  the  conditioning  of  its  computed  Hessian.  The  difference  between  the  two 
methods  lies  in  the  nonlinear  term,  which  is  present  in  the  finite-difference  form  but 
neglected  in  the  sensitivity  matrix  form.  The  nonlinear  term  is  important  only  in  so 
far  as  it  can  show  how  bad  an  assumption  of  linearity  near  the  minimum  may  be.  In 
the  end,  the  nonlinear  term  is  ignored  when  the  error  covariance  matrix  is  equated 
with  the  inverse  Hessian. 

Another  method  of  computing  the  Hessian  is  given  by  Wang  et  a/.,  (1992). 
They  construct  a  second  order  adjoint  model  using  the  tangent-linear  formulation 
of  the  adjoint  model  (see  Talagrand  &  Courtier,  1987).  Through  one  iteration  of 
their  second  order  adjoint  model,  the  Lagrange  multipliers  give  the  product  of  the 
Hessian  and  a  vector  of  perturbations.  Thus  through  N  iterations  the  Hessian  can  be 
constructed  column  by  column.  As  with  the  sensitivity  matrix  method,  the  forward 
model  is  linearized  about  the  optimally  estimated  control  variables.  Theoretically,  the 
second  order  adjoint  should  give  the  same  results  as  the  sensitivity  matrix  method, 
however  it  takes  N  iterations  instead  of  M  iterations  to  compute  the  Hessian.  An 
advantage  of  the  second  order  adjoint  method  is  that  the  condition  number  of  the 
Hessian  is  available  at  each  iteration,  which  reflects  the  convergence  rate  of  the  mini¬ 
mization  and  provides  information  on  the  scale  of  changes  being  made  to  the  control 
variables  as  the  minimization  progresses  (see  Navon  et  al.^  1992). 

2.4  Estimated  uncertainty  in  the  initial  condi¬ 
tions 

The  error  covariance  matrix  of  the  optimally  estimated  vorticity  initial  conditions  is 
estimated  for  two  different  experiments;  case  B1  from  chapter  one  which  uses  strongly 
nonlinear  dynamics,  and  the  same  experiment  but  using  more  linear  dynamics.  The 
two  different  experiments  are  performed  so  as  to  investigate  the  effect  that  nonlinear¬ 
ity  has  on  the  assimilation,  in  particular  on  the  calculation  of  the  inverse  Hessian. 
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Recall  that  the  error  in  the  vorticity  initial  conditions  as  presented  here  arises 
only  from  the  model/data  misfit  terms  in  the  cost  function.  The  boundary  values  of 
streamfunction  and  vorticity  are  treated  as  correct  in  the  assimilation,  and  are  not  an 
explicit  source  of  error  in  the  calculation  of  the  error  covariance  of  the  vorticity  initial 
conditions.  The  imposed  boundary  conditions  are  of  course  a  source  of  error  in  that 
they  prevent  the  cost  function  from  being  minimized  to  a  lower  value.  Since  their 
values  have  no  dependence  on  the  control  variables,  they  cannot  have  an  estimated 
uncertainty  arising  from  the  estimated  uncertainty  in  the  control  variables. 

2.4.1  Using  strongly  nonlinear  dynamics 

For  the  cases  presented  in  chapter  one,  the  advective  acceleration  dominates  the  time 
evolution  of  the  vorticity  field.  In  fact,  =  oo  since  {3  is  set  to  zero,  and  ^  =  7, 
where  U  =  1  m/s  is  the  max:imum  speed  in  the  jet,  i/  =  50  km  is  its  length-scale  of 
variation,  and  T  =  4  days  is  the  time-scale.  A  fluid  particle  travels  1000  kilometers 
in  24  days,  so  much  of  the  flow  seen  by  the  observing  array  originates  from  outside 
of  the  boundaries  of  the  false  model  domain. 

Using  the  optimal  initial  conditions  estimated  in  case  Bl,  chapter  one,  the  Hes¬ 
sian  is  calculated  using  both  of  the  methods  described  in  the  previous  section.  Both 
methods  produce  Hessian  matrices  that  are  the  same  to  a  high  order.  The  root-mean- 
square  difference  between  the  two  Hessian  matrices  divided  by  the  root-mean-square 
average  of  the  two  Hessian  matrices  is  0.07.  However  only  the  Hessian  calculated  by 
the  sensitivity  matrix  method  is  well-conditioned  (low  condition  number)  and  hence 
readily  invertible. 

The  condition  number  (ratio  of  the  largest  eigenvalue  to  the  smallest  eigen¬ 
value)  of  the  finite-difference  Hessian  is  very  large.  This  may  be  due  to  either  the 
nonlinear  contribution  from  multiplied  by  the  residual  (m(0)  —  d),  or  the  trunca¬ 
tion  error  from  using  finite  differences  to  calculate  the  second  derivative  from  the  first 
derivative,  or  both.  The  former  source  of  ill-conditioning  occurs  when  a  datum  is  a 
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Figure  2.1:  Relative  error  in  the  estimated  vorticity  initial  conditions  for  the  strongly 
nonlinear  case  (Bl). 


strongly  nonlinear  function  of  the  control  variables,  ie.  when  a  velocity  observation  at 
some  late  time  depends  nonlinearly  upon  the  vorticity  initial  condition.  In  this  case 
the  shape  of  the  cost  function  in  the  vicinity  of  the  minimum  need  no  longer  approx¬ 
imate  a  quadratic,  so  the  Hessian  may  not  have  all  positive  eigenvalues.  The  latter 
source  of  ill-conditioning  arises  when  a  control  variable,  6i,  has  very  little  influence 
over  any  of  the  data,  then  ^  ~  0  and  the  second  derivative  of  J  in  the  ith  direction 
may  be  very  small.  Hence  the  ith  eigenvalue  of  the  Hessian  may  be  very  small. 

In  figure  2.1  is  shown  the  relative  error  in  the  estimated  vorticity  initial  con¬ 


ditions,  defined  by 


relative  error 


diag{H-^) 

|Po| 


This  ratio  of  the  a  posteriori  to  the  a  priori  expected  variance,  is  a  measure  of  how 


well  the  assimilation  of  the  available  data  has  improved  the  estimated  uncertainty  in 


the  vorticity  initial  conditions. 
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As  one  would  expect  the  relative  error  is  lowest  in  the  vicinity  of  the  observation 
locations.  The  relative  error  is  slightly  lower  to  the  southwest  of  the  array.  The 
observing  array  mainly  sees  fluid  moving  into  the  array  from  the  southwest,  so  the 
model/data  misfit  is  quite  sensitive  to  the  vorticity  initial  conditions  to  the  south¬ 
west,  which  are  therefore  estimated  with  greater  accuracy. 

2.4.2  Using  weakly  nonlinear  dynamics 

The  dynamics  of  the  flow  field  are  rendered  more  linear  by  reducing  the  size  of  the 
ratio  of  relative  vorticity  gradient  to  planetary  vorticity  gradient.  That  is,  ^  =  0.1, 
by  setting  the  maximum  speed  in  the  jet  to  U  =  0.02  m/s,  with  jS  =  1.8  x  10~^^ 
and  length— scale  of  variation,  L  =  100  km.  A  fluid  particle  travels  40 
kilometers  in  24  days,  which  is  two  grid  points.  The  flow  is  now  sluggish  enough  to 
feel  the  ambient  planetary  vorticity  gradient  and  the  flow  can  be  described  by  Rossby 
waves. 

An  assimilation  is  performed  using  the  same  experimental  set-up  as  for  case 
Bl,  but  with  the  above  changes  to  U  and  /3  and  L.  In  figure  2.2  is  shown  the  true  and 
estimated  streamfunction  fields  (note  that  a  smaller  contour  interval  is  used  here). 
For  this  case  the  cost  function  is  successfully  minimized  in  many  fewer  iterations  than 
in  the  nonlinear  cases. 

The  Hessian  is  calculated  using  both  of  the  methods  described  in  the  previous 
section,  and  with  both  methods  the  Hessians  are  well-conditioned  and  their  inverses 
are  the  same.  In  figure  2.3  is  shown  the  relative  error.  The  relative  error  is  lowest 
in  the  vicinity  of  the  observing  array,  as  was  found  with  the  nonlinear  case.  However 
the  low  relative  error  does  not  extend  away  for  any  great  distance  from  the  observing 
array,  as  it  does  in  the  strongly  nonlinear  case,  because  the  fluid  is  moving  relatively 
slowly,  so  that  over  the  time  of  the  assimilation  the  observing  array  only  sees  the  fluid 
that  is  nearby. 
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Figure  2,3:  Relative  error  of  the  initial  conditions  for  the  weakly  nonlinear  case. 

In  the  previous  section  where  strongly  nonlinear  dynamics  are  used,  the  finite- 
difference  Hessian  is  very  similar  to  the  sensitivity  matrix  Hessian,  but  ill-conditioned. 
For  linear  dynamics  the  finite-difference  Hessian  is  found  to  be  well-conditioned. 
Hence  the  source  of  ill-conditioning  for  the  finite-difference  Hessian  in  the  nonlinear 
case  is  probably  not  due  to  truncation  errors  in  the  finite-difference  computation. 

These  results  suggest  that  even  when  the  flow  is  strongly  nonlinear,  the  nonlin¬ 
ear  term  —  d)  in  the  expression  for  the  Hessian  (2.6)  is  small  since  the 

du 

finite-difference  Hessian  is  very  similar  to  the  sensitivity  matrix  Hessian.  However 
it  is  big  enough  to  ill-condition  the  finite-difference  Hessian.  The  significant  size  of 
the  nonlinear  term  is  probably  due  to  the  residuals  (m(0)  —  d)  not  being  small.  The 
model/data  residuals  for  the  estimated  control  variables  for  case  B1  are  not  small  as 
was  seen  from  the  true  and  estimated  velocity  time  series  in  figure  1.5  and  the  final 
size  of  the  cost  function  shown  in  figure  1.8. 
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For  both  strongly  nonlinear  dynamics  and  weakly  nonlinear  dynamics  the  rel¬ 
ative  error  goes  to  one  at  the  boundaries  of  the  model  domain.  In  effect  this  is  a 
statement  that  the  assimilation  of  observations  cannot  improve  the  a  •priori  estimate 
of  the  vorticity  initial  conditions  near  the  boundary,  because  the  boundary  values  of 
vorticity  are  kept  fixed. 

The  relative  error  maps  give  only  a  partial  measure  of  the  uncertainty  in  the 
estimated  vorticity  initial  conditions.  The  sources  of  the  errors  estimated  using  the 
inverse  Hessian  are  the  errors  in  the  observations  and  the  a  priori  errors  in  the  vortic¬ 
ity  initial  conditions,  both  of  which  are  treated  as  random  statistical  errors.  However 
there  are  also  systematic  errors  in  the  assimilation  that  are  not  included  in  the  esti¬ 
mated  error  in  the  vorticity  initial  conditions.  The  most  important  systematic  error 
is  the  specification  of  boundary  conditions  as  performed  in  cases  Al,  B1  and  Cl. 

It  was  apparent  from  the  results  shown  in  chapter  one  that  the  specified  bound¬ 
ary  conditions  were  a  source  of  error  in  the  optimization  because  they  prevented  the 
cost  function  from  being  minimized  successfully  for  cases  B1  and  Cl.  Hence  the 
relative  error  maps  are  to  be  interpreted  carefully.  They  are  the  errors  in  the  vor¬ 
ticity  initial  conditions  given  a  certain  set  of  imposed  boundary  conditions  which  are 
regarded  as  perfectly  known. 

2.4.3  OfF-diagonal  structure 

As  the  full  error  covariance  matrix  is  calculated  here,  one  can  examine  its  off-diagonal 
structure.  Previous  model/data  assimilation  studies  using  the  adjoint  method,  have 
estimated  the  uncertainties  in  the  control  variables  by  calculating  elements  of  the 
main  diagonal  of  the  Hessian  using  finite-differences  (Schiller,  1995).  The  inverse 
of  each  element  is  an  approximation  to  the  error  covariance,  assuming  that  the  off- 
diagonal  elements  of  the  Hessian  are  negligible.  The  availability  of  the  full  covariance 
matrix  allows  a  comparison  to  test  the  validity  of  this  assumption. 
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Figure  2.4:  Relative  error  of  the  vorticity  initial  conditions  for  the  strongly  nonlinear  case 
(Bl),  where  all  off-diagonal  structure  in  the  Hessian  has  been  removed  before  inversion. 

In  figure  2.4  the  relative  error  for  the  vorticity  initial  conditions  from  case  Bl 
is  shown,  where  all  off-diagonal  structure  in  the  Hessian  has  been  discarded.  That 
is,  the  error  covariance  matrix  is  a  diagonal  matrix  consisting  of  the  inverses  of  each 
element  of  the  diagonal  of  the  Hessian.  Figure  2.4  may  be  compared  to  figure  2.1 
where  the  inverse  Hessian  kept  all  of  its  off-diagonal  structure.  The  relative  error  is 
larger  over  all  of  the  sub-domain  in  figure  2.4.  This  result  implies  that  information 
about  spatial  correlations  among  the  vorticity  initial  conditions  is  lost  when  the  off- 
diagonal  structure  is  ignored,  resulting  in  larger  estimated  errors.  The  strong  east- 
west  symmetry  seen  in  2.1  and  not  in  figure  2.4  is  due  to  the  off-diagonal  terms  from 
the  Laplacian  smoothing. 

An  example  of  the  off-diagonal  structure  of  the  error  covariance  matrix  is 
shown  in  figure  2.5,  so  as  to  illustrate  the  effect  that  the  assimilation  of  data  has  on 
the  estimated  covariances  among  the  vorticity  initial  conditions.  The  78th  column  of 
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Figure  2.5:  The  78th  column  of  the  error  covariance  matrix  for  the  strongly  nonlinear 
case  (Bl).  The  covariance  of  the  inital  vorticity  at  i  =  6,  j  =  5  with  the  vorticity  at  aU 
other  grid-points  at  the  initial  time  is  shown.  The  lower  plot  shows  the  a  priori  covariance 
taken  from  Pq,  the  upper  plot  shows  the  a  posteriori  estimated  covariance.  Note  the  factor 
of  ten  difference  in  the  vertical  axis. 
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the  error  covariance  matrix  for  case  B1  is  displayed,  which  corresponds  to  the  initial 
vorticity  at  grid-point  i,j  =  (6, 5)  located  in  the  south-west  of  the  sub-domain.  The 
a  priori  covariance  is  compared  to  the  a  posteriori  covariance.  The  former  is  the 
78th  column  of  Pq,  and  the  latter  is  the  78th  column  of  the  error  covariance  matrix 
calculated  using  the  sensitivity  matrix  method. 

The  a  posteriori  covariances  between  the  initial  vorticity  at  grid-point  i,j  = 
(6, 5)  and  all  other  grid-points  is  less  than  ten  percent  of  the  a  priori  covariances. 
The  assimilation  of  data  has  decreased  the  size  of  the  spatial  covariances  among  the 
vorticity  initial  conditions.  Furthermore,  the  gaussian  shape  of  the  smoothing  term 
in  the  a  priori  covariance  (see  section  1.2)  has  been  narrowed  by  the  assimilation. 
There  is  an  inherent  length-scale  for  vorticity  contained  in  the  model  dynamics  that 
is  shorter  than  the  length-scale  associated  with  the  Laplacian  smoothing  in  the  a 
priori  covariance  (see  figures  1.1  and  1.4).  The  estimated  length-scale  of  the  spatial 
covariances  among  the  vorticity  initial  conditions  is  thus  shortened  by  the  action  of 
the  model  dynamics  during  the  assimilation. 

2.5  Sensitivity 

An  interesting  feature  of  the  sensitivity  matrix,  introduced  in  section  2.3.2,  is  that 
it  contains  the  gradients  of  the  model  counterparts  of  the  data  with  respect  to  the 
control  variables.  For  the  procedure  outlined  in  section  2.3.2,  mk{0)  is  the  model 
counterpart  of  the  kth.  observation  which  could  be  say,  eastward  velocity  taken  at 
grid-point  {i,j)  at  time  t.  It  is  given  by 

-  C+i) 

2dy 

Here  6  is  the  vorticity  initial  conditions,  upon  which  the  streamfunction,  ■0,  depends 
at  future  times  over  the  grid,  and  the  subscript  k  refers  to  the  ordering  of  the  data. 

For  the  calculation  of  the  Hessian,  mk{6)  replaces  equation  (1.5)  as  the  def¬ 
inition  of  the  cost  function.  The  same  adjoint  model  equations  as  before  are  forced 
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using  this  cost  function,  as  described  in  the  appendix.  The  forcing  term  for  the  ad¬ 
joint  model  equations  is  §^,  which  is  non-zero  at  only  two  points  on  the  model 
grid, 

dmk  _  _J_ 
dipij-i  ‘^dy 
drrik  _  1 

Thus  the  adjoint  model  runs  backward  in  time  forced  by  an  impulse  at  the  location 
and  time  of  the  observation.  The  forcing  term  can  be  interpreted  as  the  sensitivity 
of  the  eastward  velocity  at  grid-point  {i,j)  at  time  t,  to  the  streamfunction  over  all 
the  grid  at  time  t. 

The  adjoint  equations  propagate  the  impulses  at  these  two  points  backward 
in  time.  When  t  =  0  is  reached  the  Lagrange  multipliers  give  the  gradient  of  rrik 
with  respect  to  the  vorticity  initial  conditions,  which  constitutes  the  kth.  row  of  the 
sensitivity  matrix.  One  could  say  that  the  sensitivity  matrix  is  the  Green’s  function 
response  of  the  adjoint  model  to  the  spatio-temporal  distribution  of  the  data. 

A  particular  row  of  the  sensitivity  matrix  may  be  displayed  to  show  the  sen¬ 
sitivity  that  a  particular  model  counterpart  of  a  datum  has  to  the  vorticity  initial 
conditions.  For  the  weakly  nonlinear  case,  figure  2.6  shows  the  first  row  of  the  sensi¬ 
tivity  matrix.  This  is  the  sensitivity,  with  respect  to  the  vorticity  initial  conditions, 
of  the  eastward  velocity  produced  by  the  forward  model  at  day  1  at  current  meter  lo¬ 
cation  1,  starting  from  the  optimally  estimated  initial  conditions.  Since  the  eastward 
velocity  is  taken  only  two  days  after  the  initial  time,  this  datum  is  most  sensitive  to 
the  vorticity  initial  conditions  in  the  immediate  vicinity  of  the  current  meter  location. 

For  an  eastward  velocity  observation  at  the  same  location  at  a  later  time, 
t  —  2Z  days,  the  influence  of  the  vorticity  initial  conditions  upon  the  observation  has 
spread  in  space,  as  shown  in  figure  2.7.  The  region  of  greatest  sensitivity  is  about  two 
grid-points  to  the  east  of  the  current  meter  location.  The  slow  westward  flow  in  this 
region  has,  after  23  days,  advected  the  vorticity  initial  conditions  past  the  current 
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Figure  2.6:  Sensitivity  of  the  eastward  velocity  recorded  at  current  meter  1  at  day  1  to  the 
vorticity  initial  conditions  for  the  weakly  nonlinear  case,  (row  1  of  the  sensitivity  matrix). 


Figure  2.7:  Sensitivity  of  the  eastward  velocity  recorded  at  current  meter  1  at  day  23 
to  the  vorticity  initial  conditions  for  the  weakly  nonlinear  case,  (row  133  of  the  sensitivity 
matrix). 
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meter  location.  Also  the  pattern  of  the  sensitivity  away  from  the  current  meter 
location  is  indicative  of  the  spread  of  information  by  Rossby  waves.  The  average  flow 
speed  at  this  location  is  0.3  cm/5,  but  the  phase  speed  of  a  barotropic  Rossby  wave 
is  about  7  cm/s  in  this  setting.  This  is  the  phase  speed  for  a  wavelength  of  400  km, 
which  is  approximately  the  wavelength  of  the  flow  seen  in  figure  2.2.  In  23  days  such 
a  wave  travels  145  km,  about  seven  grid  points,  which  can  be  seen  in  figure  2.7  as 
the  range  of  influence  of  the  estimated  vorticity  initial  conditions  on  the  datum. 

For  the  strongly  nonlinear  case  the  picture  is  more  convoluted.  As  can  be 
seen  in  figure  1.3  the  optimally  estimated  flow  field  is  quite  turbulent  relative  to  the 
weakly  nonlinear  case.  In  figures  2.8  and  2.9  the  sensitivities  of  the  eastward  velocity 
from  current  meter  1  at  1  day  and  at  23  days  to  the  vorticity  initial  conditions  are 
shown.  Similar  to  the  weakly  nonlinear  case,  the  datum  at  1  day  is  most  sensitive  to 
the  vorticity  initial  conditions  in  its  immediate  vicinity,  but  at  23  days  the  picture  is 
quite  different.  The  strong  advective  accelerations  vigorously  stir  the  initial  vorticity 
field,  causing  the  influence  of  the  vorticity  initial  conditions  upon  the  observed  velocity 
to  spread  rapidly  throughout  the  model  domain. 
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t  =  1  day 


Figure  2.8:  Sensitivity  of  the  eastward  velocity  recorded  at  current  meter  1  at  day 
1  to  the  vorticity  initial  conditions  for  the  strongly  nonlinear  case,  (row  1  of  the 
sensitivity  matrix). 


Figure  2.9:  Sensitivity  of  the  eastward  velocity  recorded  at  current  meter  1  at  day 
23  to  the  vorticity  initial  conditions  for  the  strongly  nonlinear  case,  (row  133  of  the 
sensitivity  matrix). 
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Chapter  3 


Optimal  estimation  of  boundary 
conditions 


The  results  of  the  twin  experiments  performed  in  chapter  one  showed  that  the  esti¬ 
mated  flow  field  throughout  the  interior  is  strongly  influenced  by  the  specified  bound¬ 
ary  conditions.  Much  of  the  flow  observed  in  the  true  flow  field  originates  far  from 
the  observing  array,  from  a  distance  that  is  much  greater  than  the  distance  from 
the  observing  array  to  the  boundaries  of  the  sub— domain.  Given  a  considerable  lack 
of  knowledge  about  the  true  time-dependent  values  of  streamfunction  and  vortic- 
ity  around  the  boundaries  of  the  sub— domain,  it  becomes  necessary  to  include  the 
boundary  conditions  in  the  control  variables. 

The  twin  experiments  of  chapter  one  are  now  further  developed  to  include  the 
estimation  of  boundary  conditions  as  well  as  initial  conditions.  The  control  variables 
are  now  the  time-dependent  boundary  values  of  streamfunction  and  vorticity,  and 
the  initial  values  of  vorticity  at  interior  points  of  the  model  domain.  This  set  of 
variables  is  sufficient  to  integrate  the  flow  field  forward  in  time  (as  was  mentioned  in 
chapter  one).  That  is,  all  values  of  streamfunction  and  vorticity  in  space  and  time  are 
determined  by  the  control  vector.  In  the  experiments  of  chapter  one,  only  the  initial 
conditions  were  used  as  control  variables,  and  the  boundary  values  were  imposed  and 
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not  varied  in  the  course  of  the  assimilation.  The  imposed  values  were  based  upon 
any  a  priori  knowledge  of  the  true  flow  field.  Given  a  lack  of  a  priori  knowledge,  the 
assimilation  experiments  are  improved  by  including  the  boundary  values  as  control 
variables. 

In  order  to  use  the  adjoint  method  to  find  the  optimal  control  variables  that 
minimize  a  cost  function,  expressions  are  needed  for  the  gradient  of  the  cost  function 
with  respect  to  the  control  variables.  The  gradient  with  respect  to  the  initial  vorticity 
conditions  was  used  in  chapter  one.  Here  the  gradient  with  respect  to  the  boundary 
conditions  is  also  needed.  The  expressions  for  all  the  cost  function  gradients  are 
formulated  in  the  appendix. 

An  optimal  estimation  of  open  boundary  conditions  using  the  adjoint  method 
has  been  performed  by  Seiler  (1993).  A  three-layer  quasigeostrophic  model  of  a  mid¬ 
latitude  jet  was  used  to  simulate  the  Gulf  Stream.  For  the  optimization  the  initial 
conditions  were  treated  as  known,  and  the  control  variables  were  the  boundary  values 
of  streamfunction  and  vorticity  in  each  of  the  three  layers  over  a  36  day  period.  The 
assimilation  scheme  and  the  twin  experiment  configurations  used  here  are  very  similar 
to  those  she  used.  Apart  from  the  fact  that  she  assimilated  simulated  altimeter  data, 
the  main  differences  between  this  study  and  Seiler  (1993)  are: 

(i)  Barotropic  versus  three-layers  in  the  vertical;  most  of  the  energetic  flow  in  Seiler’s 
model  occurs  in  the  upper  two  layers,  the  flow  in  the  bottom  layer  is  quite  weak. 
Hence,  in  terms  of  quasigeostrophic  dynamics,  much  of  the  variability  of  the  flow 
field  in  the  three-layer  model  is  occurring  on  the  scale  of  the  baroclinic  radius  of 
deformation  which  is  much  shorter  than  the  width  of  the  model  domain.  In  the 
barotropic  model  used  here,  all  of  the  variability  is  in  the  barotropic  mode  which 
has  an  effectively  infinite  radius  of  deformation.  Hence  the  streamfunction  on  the 
boundary  of  the  barotropic  model  instantly  feels  any  change  in  vorticity  anywhere 
else  in  the  domain.  This  will  be  seen  to  have  a  significant  effect  on  the  gradient  of 
the  cost  function  with  respect  to  boundary  streamfunction. 
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(ii)  Seiler  kept  the  initial  conditions  fixed  (assumed  to  be  known  perfectly),  and 
expanded  the  time-dependent  boundary  values  of  streamfunction  and  vorticity  in  a 
Fourier  sine  series.  The  coefficients  in  the  series  were  used  as  control  variables.  Here 
the  initial  vorticity  conditions  are  treated  as  unknown  and  are  included  in  the  control 
vector  along  with  the  boundary  values.  This  is  clearly  a  more  realistic  approach  as  it 
reduces  the  number  of  a  'priori  assumptions  about  the  flow  field.  When  the  choice  of 
first  guess  for  the  initial  and  boundary  conditions  is  made  along  with  their  expected 
size  and  smoothness,  a  priori  assumptions  are  indeed  made  but  they  are  modified 
during  the  optimization. 

The  novel  results  of  this  chapter  are  the  successful  estimation  of  initial  con¬ 
ditions  together  with  time-dependent  boundary  conditions.  The  assimilation  is  able 
to  create  temporal  variation  at  the  boundaries  that  causes  interior  flow  patterns  that 
well  match  the  strong  signals  in  the  interior  velocity  observations.  The  assimilation 
performs  well  even  when  noise  is  added  to  the  observations  and  the  first  guess  for 
the  control  variables.  In  Seiler  (1993)  the  first  guess  was  chosen  to  be  ten  percent 
different  from  the  true  values  of  the  control  variables.  Here  all  of  the  choices  of  first 
guess  are  much  further  from  their  true  values. 

Several  case  studies  are  presented  here,  showing  some  successful  results  and 
illustrating  the  problems  associated  with  the  estimation  of  boundary  conditions.  The 
quality  of  the  estimated  control  variables  for  each  experiment,  in  terms  of  a  compar¬ 
ison  with  the  true  values  and  their  sensitivity  to  the  data,  is  presented  in  the  next 
chapter. 

3.1  Boundary  values  as  control  variables 

A  first  problem  that  arises  when  boundary  values  of  streamfunction  and  vorticity  are 
used  as  control  variables  in  a  time-dependent  assimilation  is  their  large  number.  For 
a  24  day  assimilation  period  with  At  =  1  hour  and  with  25  grid  points  along  each 
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k 

1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

11 

12 

Period 

24.00 

12.00 

8.00 

6.00 

4.79 

4.00 

3.42 

3.00 

2.67 

2.38 

2.17 

2.00 

Table  3.1:  Periods  in  days  of  the  expansion  coefficients  for  the  boundary  values  in  equation 
3.1  for  truncation  K  =  12,  for  a  24  day  assimilation  period. 


edge  of  the  model  domain,  there  are  110,  784  values  of  streamfunction  and  vorticity  to 
be  estimated.  Such  a  large  number  of  variables  is  not  necessary  to  describe  the  flow 
field  adequately  because  there  is  a  high  temporal  correlation  in  both  streamfunction 
and  vorticity  at  short  time-scales.  Advantage  is  taken  of  this  correlation  to  reduce 
the  number  of  degrees  of  freedom  in  the  model  that  is  used  for  the  assimilation. 

The  time  series  of  streamfunction  and  vorticity  at  each  point  on  the  boundary. 


here  denoted  qt,  are  each  expanded  in  a  Fourier  series  with  a  linear  trend. 


qt 


^  0  ^ 
7t  Vrti 


a^cos 


(^) 


—  hksin 


'2%kt\ 


+  C; 


(3.1) 


for  t  =  0, . . .  ,r,  and  where  (j)k  =  o-k  +  ih-  Thus  the  time  series  is  represented  by 
a  DC  component,  uq,  K  cosine  and  sine  coefficients,  a*,  and  bk,  and  a  linear  trend 


coefficient,  c.  This  expansion,  without  the  cosine  terms,  is  the  same  as  that  used  by 
Seiler  (1993). 

The  linear  trend  is  removed  from  the  boundary  time-series  before  the  Fourier 
series  is  taken,  so  as  to  render  the  time-series  continuous  at  the  beginning  and  the 
end  of  the  assimilation  period.  By  doing  so  there  is  less  demand  for  high-frequency 


energy  in  the  Fourier  series,  so  the  series  can  be  truncated  at  a  lower  cut-off,  K.  The 
number  of  control  variables  needed  to  adequately  describe  the  boundary  time  series 
is  substantially  reduced  by  truncating  the  series.  By  truncating  the  series  the  assim¬ 
ilation  period  may  be  lengthened  without  increasing  the  number  of  control  variables, 
with  the  trade-off  that  high  frequency  motions  are  filtered  out  at  the  boundaries, 
which  can  introduce  errors. 
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The  forward  model  is  run  for  24  days  with  different  truncations  to  assess  the 
effect  that  the  boundary  expansion  has  on  the  interior  flow  field.  In  figure  3.1  the  error 
is  shown  in  the  streamfunction  and  vorticity  fields  in  the  interior  as  a  function  of  time 
when  the  minimum  periods  allowed  for  in  the  expansion  are  1(K  =  24),  2{K  =  12), 
3{K  =  8)  and  A(K  —  6)  days  where  T  =  577  (24  days).  The  error  is  defined  as 
the  rms  difference  between  the  interior  streamfunction  field  of  the  model  run  with  a 
truncated  Fourier  series  and  the  model  run  with  the  full  temporal  variability  of  the 
boundary  values,  normalized  by  the  rms  variability  of  the  model  run  with  the  full 
temporal  variability.  The  error  increases  as  K  decreases.  The  error  stays  well  below 
one  percent  for  K  —  12,  and  this  is  the  truncation  used  for  the  twin  experiments  in 
this  chapter.  With  this  truncation  the  minimum  period  of  variability  of  the  boundary 
values  is  two  days,  and  the  number  of  control  variables  representing  the  boundary 
values  of  streamfunction  and  vorticity  is  reduced  from  110,784  to  5,521. 

The  control  variables  are  now  the  expansion  coefficients  {ao,  6o, 

—  from  (3.1)  for  streamfunction  and  vorticity  at  each  boundary  point. 

These,  along  with  the  vorticity  initial  conditions  at  interior  points,  are  assembled 
into  one  control  vector.  The  period  corresponding  to  each  expansion  coefficient  is 
shown  in  table  3.1. 

3.2  Scaling  of  time-dependence 

With  boundary  streamfunction  and  vorticity  in  the  frequency  domain  rather  than  the 
time  domain,  the  scaling  of  the  control  variables  for  the  optimization  algorithm  re¬ 
quires  careful  consideration.  In  section  2.2  it  was  shown  how  control  variables  should 
be  scaled  so  as  to  render  the  Hessian  well-conditioned  during  the  minimization.  A 
practical  choice  of  scaling  is  the  square  root  of  the  a  ‘priori  expected  size  variances  for 
each  control  variable.  If  the  control  variables  were  kept  as  time  series  of  streamfunc¬ 
tion  and  vorticity  on  the  boundary,  the  scaling  would  simply  be  the  expected  sizes 
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Figure  3.1:  Growth  of  errors  for  (left)  streamfunction  and  (right)  vorticity  when  variabihty 
with  periods  <  1  day  (solid),  <  2  days  (dash),  <  3  days  (dot)  and  <  4  days  (dot-dash)  is 
filtered  from  the  boundary  values. 

of  streamfunction  and  vorticity.  In  the  frequency  domain  the  expected  sizes  of  the 
Fourier  coefficients  for  a  typical  time  series  may  vary  by  several  orders  of  magnitude. 

Seiler  (1993)  chose  the  scale  vector  by  taking  a  typical  forward  model  integra¬ 
tion,  expanding  the  boundary  values  of  streamfunction  and  vorticity  by  (3.1),  then 
taking  the  rms  values  of  the  coefficients.  It  is  apparent  from  the  behavior  of  the 
dynamics  used  in  the  forward  model  as  presented  in  chapter  one,  that  there  is  lit¬ 
tle  energy  at  high  frequencies  in  the  true  flow  field.  Hence  the  rms  scaling  has  a 
somewhat  red  character. 

As  in  the  experiments  in  chapter  one,  terms  are  included  in  the  cost  function 
expressing  the  size  and  smoothness  of  the  boundary  values  of  streamfunction  and 
vorticity.  Without  these  terms  the  optimization  produces  estimates  that  can  be  ar¬ 
bitrarily  large  and  noisy.  The  inclusion  of  these  terms  supposes  that  one  has  some 
a  priori  estimate  of  the  rms  size  and  smoothness  of  the  streamfunction  and  vorticity 
fields.  The  useful  function  of  the  size  and  smoothness  terms  is  to  lend  some  curvature 
to  the  cost  function  in  regions  devoid  of  data  (Thacker,  1988).  The  cost  function  is 
deemed  adequately  minimized  when  the  residual  variances  in  the  size,  smoothness 
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Experiment 

Scaling 

First  Guess 

Sampling  Interval 
(no.  time  steps) 

Observational 

Noise 

A2 

rms 

zero 

48 

no 

B2 

flat 

zero 

48 

no 

C2 

flat 

persist  true  i.c. 

48 

no 

D2 

flat 

persist  true  i.c. 

1 

no 

E2 

flat 

perturb  25  % 

1 

no 

F2 

flat 

persist  true  i.c. 

48 

yes 

G2 

flat 

persist  obj.  map  i.c. 

48 

yes 

Table  3.2:  Parameters  used  for  the  optimization  for  each  experiment,  the  meaning  of  the 
entries  is  explained  further  in  the  text. 


and  model/data  misfit  terms  are  of  the  same  order  as  their  a  priori  variances.  When 
this  is  the  case,  the  value  of  the  cost  function  is  of  order  unity. 

Twin  experiments  are  run  using  the  same  model  parameters  and  observing 
array  as  those  used  in  cases  B1  and  Cl  in  chapter  one,  except  that  now  the  expansion 
coefficients  of  the  boundary  values  of  streamfunction  and  vorticity  are  included  in 
the  control  variables.  Table  3.2  shows  the  differences  among  the  twin  experiments 
presented  in  this  chapter. 

The  performance  of  the  optimization  for  each  experiment  is  shown  in  figure 
3.2.  Plotted  is  the  value  of  the  cost  function  versus  number  of  iterations.  The  details 
of  an  iteration  of  the  adjoint  method  are  presented  in  section  A. 2  of  the  appendix. 
One  iteration  may  contain  several  calls  to  the  forward  and  adjoint  models,  before  a 
new  minimum  is  found.  For  the  minimization  routine  used  in  all  of  these  experiments, 
the  L-BFGS  limited-memory  quasi-Newton  method,  the  minimization  stops  when 
the  norm  of  the  gradient  vector  falls  below  a  certain  value  (see  appendix,  page  144). 
With  this  method,  approximations  to  the  Hessian  are  calculated  at  each  iteration  so 
that  the  new  minimum  in  the  downhill  direction  can  be  determined.  At  the  start  of 
the  minimization  the  Hessian  of  the  cost  function  is  initialized  as  the  identity  matrix, 
then  as  the  minimization  proceeds  the  Hessian  matrix  is  reconstructed  including  the 
non-diagonal  terms. 
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It  was  seen  in  chapter  two  that  when  the  dynamics  are  strongly  nonlinear, 
the  form  of  the  Hessian  may  depend  upon  the  value  of  the  control  vector  (see  equa¬ 
tion  2.6).  That  is,  the  shape  of  the  cost  function  may  show  large  departures  from  a 
paraboloid,  in  which  case  the  curvature  is  not  constant  in  control  space.  After  a  num¬ 
ber  of  iterations  the  minimization  may  stop  because  according  to  the  reconstructed 
curvature  of  the  cost  function  (the  Hessian),  it  cannot  proceed  downhill  anymore, 
even  if  the  norm  of  the  gradient  is  still  large.  When  this  happens  the  minimization 
is  restarted  at  the  last  position  of  the  control  vector,  with  the  Hessian  re-initialized 
to  the  identity  matrix.  In  figure  3.2,  the  sharp  changes  in  direction  seen  in  the  mini¬ 
mization  curves  correspond  to  points  where  the  minimization  is  restarted. 

Case  A2:  Assuming  no  prior  knowledge  of  the  initial  and  boundary  conditions  save 
an  estimate  of  their  expected  variance  and  smoothness,  the  choice  of  first  guess  for  all 
of  the  control  variables  is  zero.  The  scale  vector  for  this  case  is  chosen  as  the  rms  of 
the  initial  and  boundary  conditions  from  a  typical  model  run,  as  was  done  by  Seiler 
(1993). 

The  optimization  reduces  the  cost  function  adequately  (see  figure  3,2)  although 
requiring  far  more  iterations  than  the  other  cases.  In  figure  3.3  the  streamfunction 
in  the  model  domain  as  a  function  of  time  is  shown,  both  for  the  true  flow  field  from 
which  the  observations  were  taken,  and  for  the  flow  field  arising  from  the  optimally 
estimated  control  variables  for  case  A2.  One  can  see  that  the  estimated  flow  resem¬ 
bles  the  true  flow  only  in  the  vicinity  of  the  observing  array.  In  the  course  of  the 
optimization  the  model/data  misfit  at  the  array  does  not  need  to  force  the  boundary 
streamfunction  and  vorticity  values  very  far  from  their  first  guess  values  of  zero. 

In  the  upper  plot  of  figure  3.4  the  magnitude  of  the  scale  vector  is  plotted,  as 
well  as  the  absolute  value  of  the  estimated  control  vector  at  the  end  of  the  minimiza¬ 
tion.  The  scale  vector  and  the  control  vector  are  ordered  in  the  following  manner; 
the  first  529  elements  are  the  vorticity  initial  conditions  at  all  interior  points,  the 
next  2, 496  are  the  expansion  coefficients  for  streamfunction  along  each  of  the  four 
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Figure  3.3:  Streamfunction  every  8  days  for  (left)  the  true  flow  field  and  (right)  the 
estimated  fiow  field  for  experiment  A2  which  starts  from  a  zero  first  guess,  with  the  rms  of 
the  true  boundary  conditions  used  for  scaling.  (C.L=5000  m^/s) 


Figure  3.4:  Scale  vector  (solid)  and  absolute  value  of  the  estimated  control  vector  (dots) 
for  experiment  A2.  Top:  all  control  variables  (see  text),  and  Bottom:  a  portion  consisting 
of  the  expansion  coefficients  for  streamfunction  at  points  along  the  western  boundary. 
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boundaries  and  the  last  2,496  are  the  expansion  coefficients  for  vorticity  along  each 
of  the  four  boundaries. 

In  the  lower  plot  of  figure  3.4  is  shown  a  blow-up  of  a  portion  of  the  upper 
plot.  Shown  here  are  the  estimated  expansion  coefficents  and  their  scale  for  the 
streamfunction  along  the  western  boundary.  Here  the  first  23  points  correspond  to 
the  values  of  ao  at  the  23  grid  points  along  the  western  boundary.  The  next  23  points 
are  the  values  of  ai  along  the  western  boundary  and  so  on. 

The  red  frequency  dependence  of  the  rms  scaling  is  also  apparent  in  the  esti¬ 
mated  expansion  coefficients  shown  in  figure  3.4.  In  fact  there  appears  to  be  more 
redness  in  the  estimated  expansion  coefficients  than  in  the  scaling.  As  can  be  seen,  the 
DC  components  (ao)  of  streamfunction  and  vorticity  at  boundary  points  are  the  only 
control  variables  whose  magnitudes  are  of  the  same  order  as  their  scale.  The  magni¬ 
tudes  of  the  sine  and  cosine  coefficients  (ajt  and  bk)  and  of  the  linear  trend  coefficient 
(c),  all  fall  well  below  their  scales,  in  some  cases  by  many  orders  of  magnitude. 

The  control  variables  that  are  given  the  larger  a  priori  variances,  and  hence 
the  larger  scales,  are  preferentially  varied  by  the  minimization  algorithm  during  the 
search  for  the  minimum  of  the  cost  function.  The  smaller  the  a  priori  variance  for  a 
control  variable  the  more  it  is  constrained  to  its  a  pnori  value.  This  can  be  seen  in  the 
estimated  expansion  coefficients  for  streamfunction  and  vorticity  on  all  boundaries. 
The  low-frequency  expansion  coefficients  have  larger  a  priori  variances  than  the  high- 
frequency  expansion  coefficients,  and  all  expansion  coefficients  are  given  an  a  priori 
size  of  zero.  Thus  the  estimated  boundary  conditions  are  comprised  almost  entirely  of 
low  frequencies.  The  high  frequency  motions  present  in  the  true  boundary  conditions 
are  not  captured  in  the  estimated  boundary  conditions. 

Various  experiments  were  performed  using  different  forms  of  frequency  depen¬ 
dence  to  construct  the  a  priori  variances  of  the  expansion  coefficients  (such  as 

^For  these  experiments,  the  model  domain  hets  n  =  25  points  in  the  zonal  direction,  and  m  =  25 
points  in  the  meridional  directions.  The  corner  points  are  included  in  the  northern  and  southern 
boundaries  for  the  organization  of  the  control  vector 
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It  was  found  that  the  rms  scaling  used  in  the  above  experiment,  the 
solid  line  in  figure  3.4,  is  approximately  proportional  to  frequency~^.  This  particular 
scaling  arises  naturally  out  of  quasigeostrophic  turbulence. 

To  get  temporal  variability  in  the  estimated  boundary  conditions  of  the  same 
order  as  the  temporal  variability  in  the  true  boundary  conditions,  it  was  found  neces¬ 
sary  to  use  a  “flat”  scaling  that  is  constant  with  respect  to  frequency.  All  of  the  other 
frequency  dependencies  produce  boundary  estimates  that  are  nearly  time— invariant. 
In  fact  the  flat  scaling  is  a  more  sensible  choice  than  the  rms  scaling  as  it  is  a  statement 
that  one  has  no  reason  to  weight  one  frequency  higher  or  lower  relative  to  another 
frequency.  Furthermore  the  rms  scaling  chosen  in  case  A2  can  only  be  used  in  the 
context  of  twin  experiments  where  one  is  in  possession  of  the  spectral  density  of  the 
true  flow  field.  For  real  data  this  sort  of  information  is  not  available  and  a  flat  scaling 
is  more  appropriate. 

Case  B2:  The  parameters  for  this  experiment  are  all  the  same  as  those  used  in  case  A2, 
except  that  a  flat  scaling  is  used  where  all  expansion  coefficients  are  given  the  same  a 
priori  variance  as  that  of  the  DC  components.  Although  the  optimization  performs 
well  in  that  the  cost  function  is  adequately  minimized  (see  figure  3.2),  the  estimated 
flow  field  does  not  seem  to  resemble  the  true  flow  field  at  all  even  in  the  vicinity 
of  the  observing  array,  as  can  be  seen  in  figure  3.5  where  the  true  and  estimated 
streamfunction  fields  are  displayed. 

In  figure  3.6  is  shown  the  scale  vector  and  estimated  control  vector.  The  control 
variables  that  have  been  shifted  significantly  from  zero  relative  to  their  scales  are:  the 
vorticity  initial  conditions,  the  low-frequency  expansion  coefficients  on  the  southern 
and  western  boundaries,  and  the  high-frequency  expansion  coefficients.  One  can  see 
from  figure  3.6  that  the  flat  scaling  has  allowed  the  optimization  to  create  variability 
at  high  frequencies. 

The  model/data  misfit  at  the  data  sampling  times  (day  1,  day  3,. . .,  day  23) 
is  small  as  the  cost  function  is  well  minimized.  However  at  time-steps  other  than 
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Figure  3.5:  Same  as  figure  3.3  but  for  case  B2  which  uses  a  flat  scaling. 


the  data  sampling  times,  the  estimated  flow  fleld  can  be  radically  different  from 
the  true  flow  field  as  the  estimated  high-frequency  expansion  coefficients  have  large 
amplitudes.  Since  all  of  the  expansion  coefficients  for  boundary  streamfunction  and 
vorticity  have  the  same  scale,  and  are  zero  at  the  beginning  of  the  optimization,  the 
optimization  can  efficiently  minimize  the  cost  function  by  varying  equally  the  high 
and  low  frequency  expansion  coefficients.  Between  successive  data  sampling  times 
the  large  amplitudes  in  the  high  frequencies  can  make  the  estimated  flow  markedly 
different  from  the  true  flow,  which  has  little  variability  at  high  frequencies.  This  is 
why  the  snapshots  of  the  estimated  streamfunction  field  at  day  0,  day  8,  day  16  and 
day  24  displayed  in  figure  3.5,  do  not  resemble  the  true  streamfunction  fields. 

Note  that  the  amplitudes  of  the  estimated  control  variables  are  in  general  much 
smaller  than  their  scales.  Due  to  the  choice  of  zero  as  a  first  guess,  most  of  the  control 
variables  have  not  been  moved  very  far  from  zero,  relative  to  the  scale.  This  can  be 
seen  in  figure  3.5  where  the  estimated  streamfunction  throughout  the  sub-domain 
has  a  much  smaller  amplitude  than  the  true  streamfunction. 

For  both  case  A2  and  case  B2  the  values  of  streamfunction  and  vorticity  at 
the  corner  points  of  the  domain  are  not  shifted  much  from  their  a  priori  values  of 
zero,  compared  to  other  points  on  the  boundary.  This  can  be  seen  in  figures  3.3  and 
3.5  where  the  estimated  streamfunction  goes  to  zero  in  each  of  the  four  corners  of  the 
model  domain.  This  is  due  to  the  finite-difference  formulation  of  the  equations  gov¬ 
erning  streamfunction  and  vorticity.  The  five-point  Laplacian  relating  streamfunction 
to  vorticity,  and  the  nine-point  Arakawa  Jacobian  in  the  prognostic  equation,  enable 
streamfunction  at  all  boundary  points  to  feel  the  presence  of  vorticity  in  the  interior, 
except  at  the  corner  points.  At  the  corners,  streamfunction  and  vorticity  are  affected 
by  interior  values  only  through  the  nine-point  Arakawa  Jacobian,  since  the  five-point 
Laplacian  does  not  include  the  corner  points. 

Hence  the  estimates  of  streamfunction  and  vorticity  at  the  corners  are  more 
dependent  on  the  first  guess  of  the  flow  field  in  the  vicinity  of  the  corners.  For  the 
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zero  first  guess  cases  there  is  no  advection  on  the  boundaries  at  the  start  of  the  min¬ 
imization,  so  it  is  difficult  for  the  corner  values  to  change  through  information  being 
spread  by  the  Jacobian.  Better  results  could  be  obtained  by  starting  the  optimization 
with  a  more  informed  first  guess  for  the  control  variables. 


3.3  Choice  of  first  guess 

For  an  optimization  problem  with  nonlinear  constraints,  it  is  possible  for  the  optimal 
solution  to  be  dependent  upon  the  first  guess  for  the  control  variables  (Gill  et  al, 
1981).  For  a  linear  problem  it  is  generally  possible  to  find  the  global  minimum  of  the 
cost  function,  at  which  the  optimally  estimated  control  variables  do  not  depend  on 
their  first  guess.  Nonlinearity  can  cause  the  presence  of  multiple  local  minima  in  the 
cost  function.  In  any  gradient-search  algorithm  the  local  minimum  found  depends 
upon  where  the  search  was  started  from. 

The  first  guess  for  the  control  variables  that  is  passed  to  the  minimization 
routine  should  thus  be  chosen  to  be  as  close  as  possible  to  the  true  values,  on  the 
basis  of  any  a  priori  information  available.  Using  the  same  approach  as  in  chapter 
one,  experiments  are  run  using  first  guesses  for  the  control  variables  that  differ  in 
the  amount  of  a  priori  information  available.  Optimizations  with  the  following  first 
guesses  were  performed: 

(i)  zero  for  all  initial  and  boundary  conditions  (A2  &  B2) 

(ii)  persistence  in  time  of  true  initial  conditions  (C2,  D2  &  F2) 

(iii)  true  initial  and  boundary  conditions  randomly  perturbed  by  25%  (E2) 

(iv)  persistence  in  time  of  objectively  mapped  initial  conditions  (G2) 

The  scaling  used  for  the  experiments  in  this  section  is  the  flat  scaling  used  in  case  B2: 
a  constant  value  for  all  streamfunction  variables  and  a  constant  value  for  all  vorticity 
variables,  that  is,  no  frequency  dependence. 
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3.3.1  Persistence 


Case  C2:  The  first  guess  for  the  control  variables  is  persistence  in  time  of  the  true 
initial  conditions.  That  is,  there  is  no  temporal  variation  in  the  boundary  values  of 
streamfunction  and  vorticity.  Hence  the  first  guess  for  all  expansion  coefficients  is 
zero  except  for  the  DC  components  which  have  their  true  values.  This  experiment 
simulates  the  realistic  situation  of  having  a  hydrographic  survey  at  the  start  of  a 
mooring  time  series.  The  optimization  performs  well  compared  to  the  other  experi¬ 
ments.  The  cost  is  minimized  to  a  lower  value,  in  fewer  iterations,  than  in  any  other 
experiment  as  can  be  seen  in  figure  3.2.  In  figure  3.7  is  shown  the  time  series  of  each 
of  the  observed  velocities  together  with  the  velocities  at  the  same  locations  produced 
by.  the  estimated  control  variables.  The  fit  is  quite  good  in  that  the  residuals  between 
the  observed  and  assimilated  velocities  in  general  fall  well  below  the  a  priori  velocity 
error  of  4  cm! sec. 

In  figure  3.8  the  true  and  estimated  streamfunction  fields  are  shown.  By  cre¬ 
ating  temporal  variation  in  the  boundary  values  of  streamfunction  in  the  southwest 
corner,  the  optimization  is  able  to  cause  a  meander  to  swing  through  the  array  and 
match  the  observed  velocities.  The  true  and  estimated  vorticity  fields  are  shown  in 
figure  3.9.  In  spite  of  the  persistence  in  time  of  the  vorticity  field  at  the  start  of 
the  optimization,  the  assimilation  forces  the  vorticity  to  have  temporal  variations  at 
the  boundaries,  resulting  in  the  good  agreement  seen  between  the  true  and  estimated 
fields  at  later  times. 

The  estimated  control  vector  and  scale  vector  are  shown  in  figure  3.10.  The 
estimated  DC  components  and  the  vorticity  initial  conditions  have  amplitudes  of 
the  order  of  their  scale,  as  one  would  expect  since  they  were  given  their  true  values 
as  the  first  guess.  The  other  expansion  coefficients  are  of  similar  magnitude  to  those 
estimated  in  case  B2,  except  that  there  is  slightly  less  energy  in  the  12  day  coefficients. 

There  is  however  some  two-day  periodicity  in  the  estimated  boundary  stream- 
function.  In  figure  3.11  the  mean  is  shown  of  the  estimated  streamfunction  on  the 
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Figure  3.7:  Time  series  of  observed  velocities  (solid)  and  assimilated  velocities  (dashed)  at 
the  five  current  meters  (left)  and  between  the  two  pairs  of  tomographic  transceivers  (right) 
for  case  C2,  a  priori  error  is  4  cm/sec. 


western  boundary  as  a  function  of  time,  as  well  as  the  mean  of  the  estimated  vorticity 
on  the  western  boundary  as  a  function  of  time.  The  “wiggliness”  displayed  by  the 
streamfunction,  and  not  by  the  vorticity,  is  due  to  the  two-day  periodicity  in  the 
data  sampling.  The  process  by  which  the  data  sampling  period  becomes  energetic 
in  the  estimated  boundary  streamfunction  is  explained  in  the  appendix,  and  further 
discussed  in  section  4.3.  Essentially  it  is  a  consequence  of  the  ellipticity  of  the  Poisson 
equation  for  streamfunction. 

In  an  attempt  to  estimate  boundary  conditions  that  resemble  more  the  true 
boundary  conditions,  observations  of  velocity  are  given  at  every  time-step.  In  a 
real  assimilation  sparsely  sampled  observations  can  be  interpolated  onto  every  time- 
step  with  quite  successful  results,  because  for  this  and  many  other  ocean  models  the 
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Figure  3.8:  Same  as  figure  3.5  but  for  case  C2  which  uses  persistence  in  time  of  the  true 
initial  conditions  as  the  first  guess. 


Figure  3.9:  Vorticity  every  8  days  for  (left)  the  true  flow  fleld  and  (right)  the  estimated 
flow  field  for  experiment  C2  which  uses  persistence  in  time  of  the  true  initial  conditions  as 
the  first  guess  for  the  control  variables.  (C.L=  10“®  5“^) 
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Figure  3.10:  Same  as  figure  3.6  but  for  case  C2  which  uses  persistence  in  time  of  the  true 
initial  conditions  as  the  first  guess. 
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variability  of  interest  is  at  time  scales  much  longer  than  two  days.  The  disadvantage 
of  doing  this  is  that  the  number  of  data  points  increases  and  strictly  they  can  no 
longer  be  treated  as  independent  of  each  other.  This  has  important  consequences  for 
the  error  estimation  in  the  next  chapter. 

Case  D2:  The  same  first  guess  and  scaling  for  the  control  variables  are  used  as  in  case 
C2,  but  here  there  are  observations  supplied  at  every  time-step.  The  optimization 
performs  well  (see  figure  3.2),  and  the  minimization  is  very  similar  to  that  of  case  C2. 

For  this  case  the  optimization  finds  a  minimum  in  fewer  iterations  than  in  case 
C2  (see  figure  3.2).  The  high-frequency  components  of  the  gradient  vector  have  much 
less  amplitude  in  case  D2,  because  the  forcing  in  the  adjoint  equations  is  smoother 
in  time.  Hence  in  the  course  of  the  minimization  the  norm  of  the  gradient  vector  is 
smaller,  and  the  minimization  stops  sooner,  than  in  case  C2.  This  result  indicates 
that  the  contours  of  the  cost  function  (in  the  space  spanned  by  the  control  vector) 
are  smoother  when  there  is  data  at  every  time-step  compared  to  data  at  every  48th 
time-step. 

The  estimated  streamfunction  field  for  case  D2  shown  in  figure  3.12  looks  very 
similar  to  the  estimated  streamfunction  field  for  case  C2  shown  in  figure  3.8.  The 
scale  vector  and  estimated  control  vector  for  case  D2  are  shown  in  figure  3.13.  Note 
that  there  is  little  variability  in  the  estimated  high-frequency  expansion  coefficients. 
The  frequency  dependence  of  the  magnitudes  of  the  estimated  expansion  coefficients 
has  a  red  character,  which  is  desirable  as  the  spectral  density  of  the  true  boundary 
conditions  is  red. 

The  final  value  of  the  cost  function  is  higher  than  in  case  C2,  yet  still  less  than 
one.  The  cost  function  is  minimized  to  a  smaller  value  in  case  C2  because  there  is 
less  data  for  the  same  number  of  control  variables.  Hence  there  are  relatively  more 
degrees  of  freedom  in  fitting  the  control  variables  to  the  data  in  case  C2. 


87 


#  of  control  variable 


Figure  3.13:  Same  as  figure  3.10  but  for  case  D2  which  has  data  at  every  time-step 
instead  of  every  48  time-steps. 


In  figure  3.14  the  mean  is  shown  of  the  estimated  streamfunction  on  the  western 
boundary  as  a  function  of  time,  as  well  as  the  mean  of  the  estimated  vorticity  on  the 
western  boundary  as  a  function  of  time,  for  comparison  with  figure  3.11. 

As  it  can  be  seen,  the  two-day  variability  in  the  estimated  boundary  stream- 
function  is  not  present  for  case  D2.  The  estimated  boundary  streamfunction  jumps 
at  observation  times.  For  case  C2  these  jumps  are  at  day  1,  day  3,. . day  23.  For 
case  D2  there  are  no  jumps  as  every  time-step  is  an  observation  time.  At  grid-points 
where  the  model  equations  are  evaluated,  jumps  in  the  streamfunction  field  at  the 
observation  times  are  smoothed  in  time  by  the  dynamics  itself.  In  the  finite-difference 
formulation  of  the  forward  and  adjoint  model  equations  used  here,  the  model  equa¬ 
tions  are  not  evaluated  directly  at  the  boundary  points.  Hence  the  jumps  in  the 
boundary  streamfunction  are  not  smoothed  in  time  by  the  dynamics  acting  as  an 
interpolater.  This  phenomenon  is  further  discussed  in  section  4.3,  and  is  explained  in 
in  terms  of  the  adjoint  model  equations  in  the  appendix. 

The  estimated  boundary  vorticity  is  not  sensitive  to  the  sampling  interval 
of  the  data.  The  boundary  vorticity  only  influences  the  model  counterparts  of  the 
data  by  being  advected  through  the  interior  by  the  prognostic  vorticity  equation. 
Unlike  the  boundary  streamfunction,  the  boundary  vorticity  does  not  influence  the 
model/data  misfit  instantaneously  at  observation  times.  The  mean  of  the  estimated 
boundary  vorticity  as  a  function  of  time  is  shown  to  be  quite  similar  in  both  case  C2 
and  D2  as  the  data  provide  the  same  information. 

3.3.2  Adding  noise 

In  the  two  experiments  of  the  last  section,  the  vorticity  initial  conditions  and  the 
velocity  observations  were  assumed  to  be  perfectly  known.  In  this  section  noise  is 
added  to  the  first  guess  and  then  to  the  velocity  observations,  so  as  to  render  the 
experimental  set-up  more  realistic. 
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Figure  3.14:  The  mean  streamfunction  (Top)  and  mean  vorticity  (Bottom)  on  the  western 
boundary  as  a  function  of  time,  taken  from  the  estimated  control  variables  for  case  D2. 
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Case  E2:  Here  the  first  guess  for  the  control  variables  is  chosen  by  randomly  per¬ 
turbing  the  true  initial  and  boundary  conditions  by  25  percent.  This  is  similar  to 
how  first  guesses  for  the  control  variables  were  chosen  by  Seiler  (1993).  More  than 
for  any  other  experiment  presented  here,  the  first  guess  for  the  initial  and  boundary 
conditions  is  closest  to  their  true  values.  Although  in  case  C2  persistence  of  the  true 
initial  conditions  was  used  as  a  first  guess,  that  first  guess  did  not  contain  any  of  the 
temporal  variability  associated  with  the  meander  event. 

The  optimization  minimizes  the  cost  function  well,  in  much  the  same  manner 
as  case  C2  and  case  D2,  as  shown  in  figure  3.2.  Note  that  the  final  values  of  the  cost 
function  for  cases  D2  and  E2  are  larger  than  the  final  value  of  the  cost  function  for 
case  C2.  When  the  data  is  sampled  at  every  time-step  (D2  and  E2),  the  mean-square 
residual  of  the  model/data  misfit  tends  to  be  larger  than  when  the  data  is  sampled 
at  every  48th  time-step. 

The  estimated  streamfunction  field  well  resembles  the  true  streamfunction  field 
as  can  be  seen  in  figure  3.15.  Figure  3.16  shows  an  example  of  how  the  optimization 
changes  the  values  of  the  control  variables  away  from  their  first  guess  to  be  more 
similar  to  the  values  of  the  true  control  variables.  Plotted  is  the  streamfunction  and 
vorticity  at  t  =  0  along  the  western  boundary.  One  can  see  that  the  first  guess  (dotted) 
has  been  varied  towards  the  true  values  (solid)  in  the  course  of  the  optimization, 
resulting  in  the  estimate  (dotdash)  when  the  minimum  is  found.  This  is  probably  due 
to  the  smoothness  term  in  the  cost  function,  which  penalizes  the  second  derivative  of 
streamfunction  and  vorticity  around  the  boundary. 

Randomly  perturbing  the  true  values  of  the  control  variables  by  25  percent 
as  a  choice  of  first  guess  is  not  generally  useful.  This  type  of  first  guess  can  only 
be  used  in  the  context  of  twin  experiments  where  the  true  values  are  known.  Case 
E2  is  presented  here  as  a  connection  to  Seiler’s  study.  Also,  case  E2  shows  how  the 
adjoint  method  can  force  the  control  variables  towards  their  true  values  when  the 
optimization  is  started  near  the  true  values.  This  may  seem  an  obvious  point,  but  in 
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Figure  3.16:  Case  E2,  plotted  are  values  of  (top)  streamfunction  and  (bottom)  vorticity 
along  the  western  boundary  at  t=0,  north  is  to  the  left,  (solid)  True,  (dotted)  First  guess 
and  (dotdash)  optimal  estimate. 

all  of  the  other  cases  the  control  variables  are  not  forced  towards  their  true  values. 
The  optimization  in  each  case  finds  the  local  minimum  nearest  to  where  minimization 
is  started  from. 

Case  F2:  This  case  keeps  all  the  same  experimental  parameters  as  for  case  C2,  except 
that  noise  is  added  to  the  observations.  The  observational  noise  is  taken  from  a 
normally  distributed  random  number  generator  with  a  standard  deviation  of  4  cmf  s, 
which  is  the  a  priori  error  in  all  of  the  velocity  observations  in  this  study. 
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The  estimated  flow  fleld  is  very  similar  to  the  estimated  flow  field  for  case  C2 
shown  in  figure  3.8.  The  cost  function  is  well  minimized  (see  figure  3.2)  although  not 
to  as  low  a  value  as  in  case  C2  where  perfect  observations  of  the  true  flow  field  were 
available.  The  velocity  time-series  at  the  observing  array  are  shown  in  figure  3.17. 
The  true  velocities  (solid),  the  estimated  velocities  (dashed),  and  the  noisy  observed 
velocities  (dotted)  are  plotted  together,  in  separate  frames  for  each  current  meter  and 
the  tomography.  An  errorbar  of  4  cm/s  is  located  in  the  upper  right  corner  of  each 
frame.  The  optimization  has  successfully  reduced  the  size  of  the  residuals  between 
the  estimated  and  true  velocity  curves  to  the  order  of  the  a  priori  error. 

3.3.3  S imulat ed  hydrography 

For  a  model/data  assimilation  using  real  data,  a  first  guess  for  the  initial  and  boundary 
conditions  would  be  constructed  from  a  hydrographic  survey  or  from  climatology.  For 
these  choices  of  first  guess,  the  streamfunction  and  vorticity  values  from  the  survey 
or  climatology  need  to  be  mapped  onto  the  grid-points  of  the  model  sub-domain. 
Case  G2:  To  simulate  the  scenario  where  one  has  sparse  observations  of  temperature 
and  salinity  from  a  hydrographic  survey,  sparse  observations  are  taken  of  the  true 
streamfunction  field  at  the  initial  time.  These  streamfunction  observations  are  then 
objectively  mapped  onto  the  grid-points  of  the  model  sub-domain,  and  the  initial 
vorticity  is  calculated  from  the  mapped  streamfunction  field. 

Objective  mapping,  or  objective  analysis,  is  a  common  means  of  interpolating 
irregularly  spaced  data  onto  a  regular  grid.  Bretherton  et  al.  (1976)  give  details  of 
this  method,  which  is  based  upon  Gauss-Markov  theory.  Hogg  (1993,  1994)  applied 
objective  mapping  to  the  SYNOP  East  dataset  to  spatially  interpolate  temperature, 
salinity  and  velocity  observations  taken  at  the  observing  array. 

The  locations  at  which  observations  of  streamfunction  are  taken  are  shown  in 
figure  3.18,  along  with  the  objectively  mapped  streamfunction  field  over  the  model 
sub-domain.  The  observing  locations  are  100  km  (five  grid-points)  apart.  An 
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Figure  3.17:  Time-series  of  eastward  velocity,  u,  and  northward  velocity,  v,  at  the  five 
current  meters  and  the  east-west  and  north-south  tomographic  averages.  (Solid)  True 
velocities,  (Dashed)  estimated  velocities  and  (Dotted)  true  velocities  with  noise  added,  for 
case  F2  are  shown.  An  errorbar  of  4  cm  Is  is  located  on  the  right-hand  side  of  each  frame. 


Figure  3.18:  The  objectively  mapped  streamfunction  field  used  as  a  first  guess  for  the 
initial  conditions  for  case  G2  (C. I. =5000  m^/s).  The  observing  locations  are  marked  by 
asterisks. 

isotropic  gaussian  spatial  covariance  function  with  a  length-scale  of  200  km  is  used 
in  the  objective  mapping,  and  the  uncertainty  in  the  streamfunction  observations 
is  assumed  to  be  ten  percent  of  the  total  rms  variability  of  the  streamfunction  field. 
These  parameters  are  of  great  importance  to  the  mapping.  Here  they  are  given  rather 
arbitrary  values,  but  for  a  real  model/data  assimilation  more  informed  choices  could 
be  made  (Hogg,  1993).  The  resemblance  of  the  mapped  streamfunction  field  to  the 
true  streamfunction  field  may  be  seen  in  the  upper  left  frame  of  figure  3.19. 

The  vorticity  field  is  calculated  from  the  mapped  streamfunction  to  provide 
the  first  guess  for  the  vorticity  initial  conditions  for  this  experiment.  Persistence  in 
time  of  the  boundary  values  of  the  mapped  streamfunction  and  vorticity  is  used  as 
the  first  guess  for  the  boundary  conditions. 

An  important  issue  that  arises  at  this  stage  is  the  choice  of  a  priori  covariance 
matrix  for  the  control  variables,  which  sets  the  scaling  for  the  control  variables  in 
the  optimization.  The  objective  mapping  produces  expected  error  covariance  maps 
for  both  streamfunction  and  vorticity,  and  these  could  be  used  to  assign  a  priori 
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covariances  to  the  vorticity  initial  conditions.  The  expected  error  covariances  for 
the  vorticity  field  that  are  calculated  from  the  expected  error  covariances  for  the 
mapped  streamfunction  are  very  large,  even  when  the  error  in  the  streamfunction 
field  is  relatively  small,  because  the  errors  are  magnified  by  the  Laplacian  relating 
streamfunction  to  vorticity.  For  large  a  priori  variances  in  the  initial  and  boundary 
vorticity,  the  optimization  may  estimate  large  vorticity  values  that  cause  the  forward 
model  to  become  numerically  unstable. 

For  this  experiment  the  a  priori  covariances  for  the  control  variables  are  ini¬ 
tialized  to  the  same  values  as  in  case  C2.  The  objective  mapping  is  used  to  construct 
streamfunction  and  vorticity  fields  to  be  used  as  a  first  guess  for  the  initial  and  bound¬ 
ary  conditions,  and  the  error  covariance  estimates  from  the  objective  mapping  are  not 
utilised.  Further  experiments  would  investigate  the  applicability  of  the  objective  map¬ 
ping  error  covariances  to  a  time— dependent  model/data  assimilation.  It  is  not  clear 
at  this  stage  how  one  would  select  the  a  priori  covariances  for  the  time-dependent 
boundary  conditions.  A  main  purpose  of  this  study  is  to  examine  the  improvement 
that  assimilation  of  interior  data  with  strongly  nonlinear  dynamics  gives  to  estimated 
boundary  conditions,  so  a  simple  form  for  their  a  priori  covariances  is  chosen  as  a 
first  step. 

The  true  and  estimated  streamfunction  fields  are  shown  in  figure  3.19,  and  the 
minimization  of  the  cost  function  performs  very  similarly  to  case  F2  (see  figure  3.2). 
As  with  the  other  experiments  in  this  chapter  the  optimization  is  able  to  vary  the 
boundary  streamfunction  and  vorticity  away  from  persistence  and  to  cause  a  meander 
to  swing  through  the  observing  array.  The  good  performance  of  this  experiment  bodes 
well  for  the  assimilation  of  the  SYNOP  Eastern  Array  velocities,  with  the  first  guess 
for  the  control  variables  taken  from  the  initial  hydrographic  survey. 

The  results  of  this  chapter  show  that  for  the  experimental  setup  used  here 
there  are  many  local  minima  of  the  cost  function,  that  depend  upon  the  first  guess 
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Figure  3.19:  Same  as  figure  3.8  but  for  case  G2  for  which  there  is  noise  added  to  the 
velocity  observations,  and  the  first  guess  for  the  vorticity  initial  conditions  is  taken  from  an 
objective  mapping  of  a  simulated  hydrographic  survey  of  streamfunction. 
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for  the  control  variables.  Further  comparisons  between  the  cases  will  be  presented  in 
the  next  chapter. 

For  actual  observations,  case  G2  is  the  most  likely  scenario  that  would  be  used 
to  assimilate  data  into  a  model.  In  the  real  world,  one  is  likely  to  have  a  snapshot 
of  the  ocean  state  from  a  hydrographic  survey  followed  by  time-dependent  sparse 
observations  such  as  mooring  time  series,  as  in  the  SYNOP  Eastern  Array  dataset. 
Case  C2  may  be  regarded  as  the  most  successful  of  the  twin  experiments  in  the  sense 
that  the  cost  function  is  minimized  to  the  smallest  value,  the  first  guess  and  scaling 
are  chosen  using  reasonable  and  realistic  assumptions,  and  the  data  is  sparse  in  space 
and  time. 
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Chapter  4 


Quality  of  the  estimated  boundary 
conditions 


The  quality  of  the  fit  of  the  estimated  control  variables  to  the  velocity  observations 
is  examined  in  this  chapter.  First  the  residuals  of  the  state  variables  and  control 
variables  with  their  true  values  are  presented,  in  the  context  of  the  twin  experiments. 
Then  the  results  of  case  C2  are  used  to  calculate  the  sensitivity  and  Hessian  matrices. 
The  sensitivity  matrix  is  used  to  investigate  the  sensitivity  of  the  estimated  control 
variables  to  the  observations.  The  processes  by  which  information  is  spread  during 
the  optimization  are  discussed.  The  estimated  error  variances  for  the  control  variables 
are  calculated  from  the  inverse  Hessian  using  the  sensitivity  matrix  method  that  was 
described  in  chapter  two.  Finally,  error  maps  for  the  interior  streamfunction  and 
vorticity  fields  are  constructed. 

The  novel  results  of  this  chapter  are  the  estimation  of  errors  for  streamfunction 
and  vorticity,  at  all  boundary  points  and  all  interior  points  at  all  times,  when  the 
dynamics  are  strongly  nonlinear. 
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4.1  Comparison  with  true  values 


The  use  of  twin  experiments  allows  a  comparison  of  any  feature  of  the  estimated  flow 
field  with  that  of  the  true  flow  field.  Some  measures  of  the  difference  between  the 
true  flow  field  and  the  estimated  flow  field  are  now  presented.  The  error  measures 
are  expressed  in  terms  of  a  normalized  rms  residual. 

When  these  error  measures  are  of  order  one,  the  mean-square  residual  between 
the  estimates  and  their  true  values  is  of  the  same  order  as  the  variance  in  the  true 
values,  indicating  that  the  estimates  are  as  close  to  the  true  values  as  would  be  any 
other  realization  of  the  flow  field.  When  the  normalized  rms  residual  is  less  than  one, 
one  can  say  that  the  estimates  are  probably  closer  to  the  true  values  than  some  other 
realization  of  the  flow  field.  In  the  following  expressions,  i/'h,  are  the  true  values 
and  Cij  are  the  estimated  values  of  the  streamfunction  and  vorticity  respectively. 
The  total  streamfunction  error: 

^  'Eij, 

^T,  =  - 7-7T2 

1.  (V'ij)  J 

where  the  summation  is  over  all  interior  points  (i  =  2,  . . . ,  n  —  1  and  j  =  2,  . . . , 
m  —  1  )  and  for  all  t  =  0,  . . . ,  T. 

The  total  vorticity  error 

where  the  summation  is  over  all  interior  points  and  for  all  t  =  1,  . . . ,  T. 


The  boundary  streamfunction  error 

^  Y^b,t  (J’b  “  ■’/’b) 

The  boundary  vorticity  error 


1 

2 
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Experiment 

^Ts 

£tv 

^hs 

^bv 

r. 

-'XV 

A2 

0.79 

(1.00) 

1.31 

(1.00) 

0.81 

(1.00) 

1.09 

(1.00) 

0.91 

(1.00) 

B2 

0.96 

(1.00) 

1.05 

(1.00) 

0.99 

(1.00) 

1.03 

(1.00) 

0.97 

(1.00) 

C2 

0.36 

(0.64) 

0.63 

(0.96) 

0.43 

(0.41) 

1.03 

(1.07) 

0.02 

(0.00) 

D2 

0.33 

(0.64) 

0.69 

(0.96) 

0.43 

(0.41) 

1.04 

(1.07) 

0.01 

(0.00) 

E2 

0.06 

(0.21) 

0.21 

(0.44) 

0.11 

(0.18) 

0.20 

(0.29) 

0.26 

(0.26) 

F2 

0.46 

(0.64) 

0.68 

(0.96) 

0.62 

(0.41) 

1.05 

(1.07) 

0.03 

(0.00) 

G2 

0.53 

(0.89) 

0.82 

(1.18) 

0.66 

(0.61) 

0.95 

(1.02) 

0.66 

(0.66) 

Table  4.1:  Error  measures.  The  values  in  parentheses  are  the  error  measures  for  the 
streamfunction  and  vorticity  fields  resulting  from  the  first  guess  for  the  control  variables. 

where  the  summation  is  over  the  points  along  all  four  boundaries  and  over  all  time- 
steps. 

The  initial  vorticity  error 

I  (cs)  J 

where  the  summation  is  over  all  interior  points. 

These  error  measures  are  displayed  in  table  4.1,  for  the  experiments  of  chapter 
three.  Shs,  £bv  and  £„  are  all  measures  of  the  error  in  the  estimated  control  variables 
relative  to  the  true  control  variables.  Sts  and  Sxv  are  called  measures  of  error  in 
the  estimated  state  variables  relative  to  their  true  values.  State  variables  are  model 
variables,  not  included  in  the  control  vector,  whose  values  depend  on  the  control 
variables.  Shown  in  parentheses  are  the  error  measures  at  the  start  of  the  assimilation. 
These  are  the  errors  in  the  flow  field  resulting  from  the  first  guess  for  the  control 
variables. 
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The  streamfunction  errors  are  all  less  than  the  vorticity  errors.  One  reason  for 
this  is  that  the  streamfunction  field  is  more  readily  adjusted  during  the  assimilation 
than  the  vorticity  field,  due  to  the  ellipticity  of  the  Poisson  equation  (this  point  is 
illustrated  further  in  the  next  section).  Another  reason  for  the  streamfunction  errors 
being  lower  than  the  vorticity  errors  is  that  the  vorticity  field  has  shorter  length  scales 
of  variability  compared  to  the  streamfunction  field  (see  figure  1.1).  The  fine  structure 
of  the  estimated  vorticity  field,  as  one  moves  away  from  the  observing  array,  may 
differ  substantially  from  that  of  the  true  vorticity  field. 

The  total  errors  and  the  boundary  and  initial  errors  for  case  A2  and  case 
B2  are  all  of  order  one,  indicating  that  the  estimated  streamfunction  and  vorticity 
fields  throughout  the  domain  are  not  at  all  similar  to  the  true  fields.  This  was 
already  glimpsed  in  the  streamfunction  snapshots  in  figures  3.3  and  3.5.  Although 
the  cost  function  is  minimized  to  a  sufficiently  low  value,  so  that  the  velocity  error 
at  the  observing  array  is  of  the  order  of  the  a  •priori  error,  the  similarity  between  the 
estimated  and  true  flow  fields  decreases  as  one  moves  away  from  the  observing  array. 
Zero  was  the  first  guess  for  all  of  the  control  variables  for  both  of  these  cases.  This  is 
obviously  a  poor  choice,  so  it  is  hardly  surprising  that  the  residuals  are  of  the  order 
of  the  true  variability. 

For  cases  C2  and  D2,  where  persistence  in  time  of  the  true  initial  conditions 
is  used  as  a  first  guess,  the  total  errors  are  improved  somewhat  but  the  boundary 
errors  are  of  the  order  of  the  boundary  errors  for  the  first  guess.  The  first  guess  for 
these  cases  has  a  stationary  jet  entering  the  domain  through  the  southwest  corner, 
and  leaving  through  the  southeast  corner.  The  boundary  values  of  streamfunction 
and  vorticity  near  the  inflow  may  be  varied  effectively  in  the  optimization  in  order 
that  the  velocities  observed  downstream  of  the  inflow,  at  the  observing  array,  agree 
well  with  the  true  velocities.  Thus  information  from  the  model/data  misfit  at  the 
array  is  spread  through  some  of  the  domain,  somewhat  reducing  the  total  errors  from 
their  first  guess  values.  The  estimated  boundary  values  are  changed  away  from  their 
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first  guess  mainly  in  the  vicinity  of  the  inflow,  and  they  are  not  forced  to  resemble 
the  true  values  elsewhere  around  the  boundary.  This  last  point  is  illustrated  further 
in  the  next  section  when  the  sensitivity  matrix  is  examined. 

For  all  cases  the  error  in  the  estimated  vorticity  initial  conditions  does  not  de¬ 
crease  from  the  error  for  the  first  guess.  As  was  found  in  chapter  one,  the  optimization 
preferentially  fits  the  strongest  signals  in  the  observations  which,  in  the  experiments 
presented  here,  is  the  meander  event  that  swings  through  the  observing  array  at  later 
times.  The  time— dependent  boundary  values  have  more  influence  on  the  meander 
event  than  do  the  initial  conditions,  hence  the  optimization  preferentially  varies  the 
boundary  conditions. 

The  total  errors  for  each  case  as  a  function  of  time  are  shown  in  figure  4.1. 
The  strong  two-day  periodicity  in  the  estimates  from  cases  B2,  C2,  F2  and  G2  is 
readily  apparent.  Also  apparent  is  the  increase  in  error  with  time  for  cases  C2  and 
D2,  as  the  flow  field  evolves  away  from  the  initial  conditions  (which  are  given  their 
true  values  at  the  start  of  the  optimization). 

The  only  difference  between  the  experimental  set-up  for  case  C2  and  case  D2 
is  that  in  case  D2  there  is  data  at  every  time-step  whereas  case  C2  has  data  at  every 
48th  time-step.  One  would  thus  expect  lower  error  measures  for  case  D2  as  there  is 
more  data  available.  From  the  error  measures  shown  in  table  4.1  and  figure  4.1,  it 
can  be  seen  that  the  only  significant  difference  between  the  two  cases  is  that  the  total 
streamfunction  error  for  case  D2  does  not  contain  the  two— day  variability  seen  in  case 
C2.  The  availability  of  the  extra  data  does  not  improve  the  assimilation  because:  (i) 
Most  of  the  variability  in  the  true  field  and  in  the  velocity  observations  is  at  periods 
greater  than  two  days  (see  figure  3.7),  so  sampling  the  true  field  every  hour  instead  of 
every  two  days  does  not  supply  more  information  in  the  optimization,  (ii)  The  extra 
data  points  in  case  D2  are  taken  at  the  same  locations  of  the  data  points  in  case  C2. 
The  temporal  and  spatial  coverage  of  the  data  is  effectively  the  same  in  both  cases, 
thus  their  error  measures  are  similar. 
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For  case  E2,  one  can  see  from  table  4.1  that  the  optimization  is  able  to  force 
the  flow  field  towards  its  true  values  throughout  the  sub-domain.  The  effect  of 
adding  noise  to  the  velocity  observations  slightly  increases  all  of  the  error  measures 
as  can  be  seen  by  comparing  case  F2  to  case  C2.  Using  an  objectively  mapped 
streamfunction  field  to  construct  the  first  guess  does  not  adversely  affect  the  error 
measures.  Comparing  case  G2  to  case  F2  in  table  4.1,  one  can  see  that  the  total  error 
measures  are  reduced  by  approximately  the  same  proportion  from  their  first  guess 
values. 


4.2  Sensitivity 

The  gradient  of  the  model  counterpart  of  a  datum  with  respect  to  the  estimated 
boundary  and  initial  conditions,  can  be  calculated  via  the  sensitivity  matrix.  This 
procedure  was  presented  in  section  2.5  for  the  estimated  initial  conditions.  Sensitivi¬ 
ties  for  individual  data  points  are  now  calculated  using  the  estimated  boundary  and 
initial  conditions  from  twin  experiment  C2. 

As  in  section  2.5  the  sensitivities  are  illustrated  for  two  representative  data 
points.  The  first  is  the  eastward  velocity  at  current  meter  3  (at  the  center  of  the 
array)  at  day  5  (time-step  120),  the  second  is  also  the  eastward  velocity  at  the  same 
location  but  fourteen  days  later  at  day  19  (time-step  456).  These  two  examples  are 
chosen  to  show  how  the  influence  of  the  model/data  misfit  on  the  control  variables 
changes,  from  early  times  to  later  times  in  the  assimilation  period. 

In  figures  4.2,  4.3  and  4.4  are  shown  the  sensitivities  of  the  first  data  point  to 
the  vorticity  initial  conditions,  the  vorticity  boundary  conditions  and  the  streamfunc¬ 
tion  boundary  conditions  respectively.  Figures  4.5,  4.6  and  4.7  are  the  same,  but  for 
the  second  data  point.  The  contour  interval  is  the  same  in  all  of  the  figures  in  this 
section.  In  these  figures  the  sensitivity  with  respect  to  the  scaled  control  variables  is 


107 


Figure  4.2:  Sensitivity  to  the  vorticity  initial  conditions,  of  the  eastward  velocity  recorded 
at  current  meter  3  (center)  at  day  5. 

displayed,  ie.  boundary  streamfunction,  boundary  vorticity  and  initial  vorticity  are 
all  of  order  one. 

Comparing  figures  4.2  and  4.3,  it  is  apparent  that  the  datum  at  day  5  is  more 
sensitive  to  the  vorticity  initial  conditions  than  to  the  vorticity  boundary  conditions, 
which  is  to  be  expected  early  on  in  the  assimilation  period.  The  velocity  at  current 
meter  3  at  five  days  is  of  the  order  of  10  cm/s,  both  in  the  true  flow  field  and  in  the 
estimated  flow  field  (see  the  velocity  time  series  in  figure  3.7)  which  suggests  that 
information  from  the  model/data  misfit  is  advected  approximately  two  grid  points  in 
five  days  in  the  adjoint  model.  One  can  see  from  figure  4.2  that  the  velocity  observa¬ 
tion  at  the  central  current  meter  is  most  sensitive  to  the  vorticity  initial  conditions 
about  two  grid  points  away  to  the  south-southwest.  This  is  in  agreement  with  the 
estimated  flow  field  at  t  =  0  shown  in  figure  3.8. 

Also,  one  can  see  (from  figure  3.8)  that  the  inflowing  velocities  at  the  southern 
end  of  the  western  boundary  at  early  times  are  quite  high.  The  majcimum  flow 
speed  in  the  jet  is  of  the  order  of  1  m/s.  In  five  days,  information  can  be  advected 
approximatedly  twenty  grid  points  by  the  flow  emanating  from  the  western  boundary, 


108 


NORTH 


EAST 


Figure  4.3:  Sensitivity  to  the  vorticity  boundary  conditions,  of  the  eastward  velocity 
recorded  at  current  meter  3  (center)  at  day  5.  The  abscissa  of  each  plot  is  time  marked  in 
days,  the  ordinate  of  each  plot  is  distance  along  the  boundary  marked  at  every  grid  point. 
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Figure  4.4:  Same  as  figure  4.3  but  for  the  streamfunction  boundary  conditions. 
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Figure  4.5:  Sensitivity  to  the  vorticity  initial  conditions,  of  the  eastward  velocity  recorded 
at  current  meter  3  (center)  at  day  19. 

which  is  greater  than  the  distance  from  the  observing  location  to  the  boundary  inflow. 
Hence  the  datum  at  day  5  displays  some  sensitivity  to  streamfunction  and  vorticity 
at  earlier  times  on  the  western  boundary. 

From  the  sensitivity  plots  for  the  second  data  point  at  19  days,  we  can  see  that 
this  datum  is  more  sensitive  to  the  boundary  streamfunction  and  boundary  vorticity 
than  to  the  initial  vorticity.  At  this  later  time,  the  flow  observed  by  the  central 
current  meter  is  almost  all  entering  the  model  domain  at  times  later  than  i  =  0. 

In  the  plots  of  the  sensitivity  of  the  data  to  the  estimated  streamfunction 
boundary  conditions  (figures  4.4  and  4.7),  there  are  peaks  in  sensitivity  along  all 
boundaries  at  the  observation  times  (the  observation  times  are  indicated  by  check 
marks  on  the  horizontal  axes).  The  instantaneous  sensitivity  of  the  velocity  at  an  in¬ 
terior  point  to  all  boundary  values  of  streamfunction,  is  explained  in  the  appendix  in 
terms  of  the  adjoint  model  equations  and  the  Lagrange  multipliers.  This  phenomenon 
can  also  be  understood  in  terms  of  the  forward  model.  If  the  streamfunction  is  per¬ 
turbed  anywhere  on  the  boundary  at  some  time,  the  perturbation  is  felt  throughout 
the  interior  streamfunction  field  at  the  same  time,  since  the  total  streamfunction  field 
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Figure  4.6:  Sensitivity  to  the  vorticity  boundary  conditions,  of  the  eastward  velocity 
recorded  at  current  meter  3  (center)  at  day  19.  The  abscissa  of  each  plot  is  time  marked  in 
days,  the  ordinate  of  each  plot  is  distance  along  the  boundary  marked  at  every  grid  point. 


must  still  satisfy  the  Poisson  equation  at  that  time.  This  point  is  further  discussed 
in  the  next  section. 

Both  data  points  presented  here  appear  to  be  more  sensitive  to  the  boundary 
streamfunction  at  earlier  times,  than  to  the  boundary  vorticity  at  earlier  times.  By 
varying  the  boundary  streamfunction  at  times  earlier  to  the  observation  time,  the 
entire  streamfunction  field  can  be  varied  at  earlier  times.  Thus  the  streamfunction 
that  is  observed  at  a  current  meter  can  readily  be  changed  by  varying  the  boundary 
streamfunction  at  the  observation  time  and  at  earlier  times.  Apart  from  the  instan¬ 
taneous  responses  to  the  boundary  streamfunction,  both  data  points  are  relatively 
insensitive  to  conditions  on  the  northern  and  eastern  boundaries. 

These  sensitivity  plots  for  case  C2  show  that  the  observations  produced  by  the 
estimated  ocean  are  most  sensitive  to  streamfunction  and  vorticity  on  the  southwest 
part  of  the  boundary  at  early  times.  The  model  counterparts  of  the  data  are  forced 
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Figure  4.7;  Same  as  figure  4.6  but  for  the  streamfunction  boundary  conditions. 


towards  the  true  data,  during  the  optimization,  by  varying  the  streamfunction  and 


vorticity  on  the  southwest  part  of  the  boundary,  where  the  jet  enters  the  model 


domain. 


4.3  The  spread  of  information 

The  communication  of  information,  from  the  boundary  conditions  to  the  model  coun¬ 
terparts  of  the  data  in  the  forward  model,  and  conversely  from  the  model/ data  misfit 
to  the  boundary  gradients  in  the  adjoint  model,  appears  to  occur  by  two  distinct 
processes  contained  in  the  dynamical  model. 

Information  can  be  advected.  The  conservation  of  vorticity  along  streamlines 
(albeit  with  some  dissipation)  allows  boundary  vorticity  at  inflow  points  to  influence 
the  interior  flow  field.  Examination  of  the  sensitivity  plots  in  figures  4.3  and  4.6, 
together  with  the  snapshots  of  estimated  streamfunction  and  vorticity,  figures  3.8 
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and  3.9  for  case  C2,  shows  that  advection  of  vorticity  along  streamlines  is  how  the 
south-west  boundary  vorticity  at  earlier  times  is  influencing  the  model  counterparts 
of  the  current  meter  data  at  later  times.  The  values  of  streamfunction  on  the  south¬ 
west  boundary  also  strongly  influence  the  current  meter  data,  as  can  be  seen  from 
figures  4.4  and  4.7.  The  boundary  streamfunction  sets  the  location  and  speed  of  the 
inflow  and  outflow  as  the  streamfunction  field  gives  the  advective  velocities.  The 
model  counterparts  of  the  data  are  thus  more  sensitive  to  the  boundary  values  of 
streamfunction  than  to  the  boundary  values  of  vorticity,  at  the  south-west  inflow 
points. 

Information  can  be  spread  instantaneously.  This  process,  which  has  been 
touched  upon  several  times  so  far  in  this  study,  is  a  consequence  of  the  ellipticity 
of  the  Poisson  equation  that  relates  streamfunction  to  vorticity. 

On  page  143  in  the  appendix  it  is  explained  how  observations  of  streamfunction^ 
cause  instantaneous  jumps  in  the  cost  function  gradient  for  the  boundary  streamfunc¬ 
tion  at  the  observation  times.  The  estimated  boundary  streamfunction  itself  then 
exhibits  these  jumps  at  the  observation  times.  The  gradient  for  boundary  vorticity 
does  not  respond  instantaneously  to  the  intermittent  observations,  so  the  estimated 
boundary  vorticity  is  smooth  in  time.  Moreover,  because  streamfunction  and  vor¬ 
ticity  at  the  boundary  grid-points  are  not  forced  to  obey  the  model  equations,  the 
spikes  in  the  boundary  streamfunction  are  not  smoothed  in  time  by  the  dynamics. 
This  phenomenon  is  clearly  seen  in  the  sensitivity  plots  shown  in  the  previous  section. 

In  section  3.3,  a  comparison  was  made  between  case  C2  and  case  D2.  The  only 
difference  between  these  two  experiments  is  that  C2  had  data  available  at  every  48th 
time-step  whereas  D2  had  data  available  at  every  time-step.  It  was  seen  that  the 
periodicity  of  the  data  sampling  appears  in  the  temporal  behavior  of  the  estimated 
boundary  streamfunction,  but  not  in  the  estimated  boundary  vorticity  (compare  fig- 

^The  velocity  observations  are  explicit  functions  of  the  streamfunction  at  neighboring  points,  see 
equation  (1.5) 
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ure  3.11  for  case  C2  with  figure  3.14  for  case  D2).  Note  that  because  the  boundary 
streamfunction  is  expanded  in  a  Fourier  series  with  a  minimum  period  of  two  days, 
the  spikes  at  the  observation  times  for  case  C2  become  a  wave  with  a  two  day  period. 
Hence  for  cases  B2  and  C2  (which  both  used  flat  scaling  and  two-day  data  sampling) 
there  is  a  strong  two-day  periodicity  in  the  estimated  boundary  streamfunction.  This 
is  apparent  in  the  residual  error  measures  for  streamfunction  plotted  in  figure  4.1. 

For  the  estimated  streamfunction  fields  from  case  C2,  the  two-day  periodicity 
is  strongest  at  the  boundaries  and  diminishes  as  one  moves  away  from  there  as  the 
model  equations  act  as  a  smoother  in  the  interior.  In  figure  4.8  time  series  of  true  and 
estimated  streamfunction  are  shown  at  four  grid-points  from  the  western  boundary 
to  the  centre  of  the  domain.  The  two-day  variability  is  strongest  at  the  boundary 
and  diminishes  for  grid-points  farther  from  the  boundary. 

It  appears  therefore  that  the  model  dynamics  is  performing  well  as  a  dynamical 
interpolater  between  observation  times  for  the  interior  streamfunction  field,  but  not 
so  for  the  boundary  streamfunction  as  the  equations  are  not  directly  evaluated  at 
boundary  points.  The  dynamics  produces  motions  with  periods  much  longer  than 
two  days,  so  the  interior  streamfunction  is  smoothed  between  observation  times.  The 
boundary  streamfunction  is  in  a  sense  decoupled  from  the  interior  dynamics,  so  the 
spikes  in  the  temporal  behavior  are  not  smoothed. 

4.4  Estimated  errors  in  the  boundary  conditions 

Using  the  sensitivity  matrix  method  of  section  2.3.2,  the  Hessian  of  the  cost  function 
with  respect  to  the  control  variables  is  calculated  and  inverted  to  produce  the  esti¬ 
mated  error  variance  of  the  control  variables  at  the  minimum,  and  hence  the  estimated 
error  in  the  initial  conditions  and  the  boundary  conditions. 

For  the  experiments  of  chapter  three  the  control  vector  has  5,521  elements, 
hence  the  Hessian  matrix  has  5,521  x  5,521  =  30,481,441  elements.  For  a  matrix 
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of  this  size  it  is  unfeasible  to  perform  the  same  tests  comparing  the  finite-difference 
Hessian  to  the  Hessian  calculated  using  the  sensitivity  matrix,  as  was  done  in  chapter 
two. 

Advantage  is  taken  of  the  fact  that  the  Hessian  produced  by  the  sensitivity 
matrix  method  is  by  definition  symmetric  and  positive— definite.  Only  the  upper 
triangular  part  of  the  Hessian  then  needs  to  be  calculated,  then  the  matrix  is  Cholesky 
factorized  and  inverted.  These  operations  were  performed  on  the  Cray  YMP  at 

NCAR. 

The  expression  for  the  estimated  relative  error  in  the  control  variables  was 
presented  in  section  2.4.  It  is  defined  as 

diagiH-^) 

|Po| 

where  is  the  inverse  of  the  Hessian  matrix,  and  Po  is  the  a  priori  covariance 
matrix  for  the  control  variables. 

Recall  that  the  control  variables  are  the  vorticity  initial  conditions  plus  the  ex¬ 
pansion  coefficients  from  equation  3.1  for  boundary  streamfunction  and  vorticity.  The 
expansion  coefficients  are  converted  back  to  time-series  by  a  linear  transformation. 
The  same  linear  transformation  is  used  to  convert  the  estimated  error  covariances  for 
the  expansion  coefficients,  into  error  covariance  time-series  for  boundary  streamfunc¬ 
tion  and  vorticity. 

Estimated  boundary  and  initial  conditions  whose  values  have  been  changed 
by  the  optimization  from  their  first  guess,  have  relative  errors  less  than  one.  The 
lower  the  relative  error  the  more  influence  an  initial  or  boundary  value  has  on  the 
model/data  misfit,  and  due  to  the  presence  of  data  it  is  known  with  more  certainty 
than  before. 

The  estimated  relative  errors  shown  here  are  for  the  estimated  control  variables 
from  case  C2.  In  figure  4.9  is  shown  the  estimated  relative  error  in  the  vorticity  initial 
conditions,  figure  4.10  shows  the  estimated  relative  error  in  the  vorticity  boundary 
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Figure  4.9:  Estimated  relative  error  in  the  vorticity  initial  conditions  for  case  C  from 
chapter  three.  This  figure  may  be  compared  to  figures  2.3  and  2.1. 

conditions,  and  figure  4.11  shows  the  estimated  relative  error  in  the  streamfunction 
boundary  conditions. 

The  first  guess  for  the  vorticity  initial  conditions  in  this  case  (C2)  is  the  true 
initial  conditions.  Consequently  the  optimization  does  not  change  the  vorticity  initial 
conditions  very  much  from  their  first  guess.  The  model/data  misfit  at  later  times 
is  not  as  sensitive  to  changes  in  these  initial  conditions,  as  it  is  to  changes  in  the 
boundary  conditions  at  later  times.  Hence  the  estimated  error  is  not  changed  much 
from  its  a  priori  value.  Only  in  the  immediate  vicinity  of  the  observing  array  does 
the  assimilation  improve  the  relative  error  in  the  vorticity  initial  conditions. 

The  vorticity  boundary  conditions  are  changed  from  their  first  guess  only  in  the 
southwest  corner,  as  was  seen  in  the  last  two  sections.  Hence  the  relative  error  is  low 
only  in  the  southwest  corner  at  early  times.  The  meander  event  in  the  observations  at 
later  times  is  produced  in  the  assimilation  by  changing  the  boundary  vorticity  in  the 
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Figure  4.11:  Estimated  relative  error  in  the  streamfunction  boundary  conditions  for  case 
C2  from  chapter  three. 
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southwest  corner  at  early  times  away  from  its  first  guess.  Elsewhere  on  the  boundary, 
vorticity  does  not  influence  the  model/data  misfit. 

From  the  sensitivity  calculations  we  saw  that  boundary  streamfunction  instan¬ 
taneously  affects  the  model/data  misfit  at  observation  times.  This  effect  dominates 
the  relative  error  maps  for  boundary  streamfunction.  The  relative  error  becomes  lower 
at  earlier  times  because  a  change  in  boundary  streamfunction  at  some  time  causes  the 
entire  interior  streamfunction  held  to  change  at  that  time,  thus  affecting  the  entire 
streamfunction  field  at  any  later  time.  Also  the  highly  advective  nature  of  the  flow 
means  that  it  takes  time  to  go  from  the  boundary  to  the  data  points.  Hence  the  ear¬ 
lier  boundary  streamfunction  influences  more  observations  than  the  later  boundary 
streamfunction,  so  the  estimated  error  is  lower  at  earlier  times. 

An  interesting  comparison  would  be  to  calculate  the  Hessian  for  case  D2,  to 
see  whether  continuous  data  sampling  improves  the  estimated  error.  For  practical 
reasons  this  is  beyond  the  scope  of  this  study  as  the  sensitivity  matrix  for  case  D2 
is  larger  (greater  number  of  elements)  than  the  Hessian,  due  to  the  much  greater 
number  of  data  points.  The  sensitivity  matrix  method  would  thus  be  an  inefficient 
way  of  calculating  the  Hessian  for  case  D2,  and  one  might  be  better  off  using  finite- 
differences  or  the  second  order  adjoint  method  of  Wang  et  al.  (1992).  To  further 
complicate  matters,  the  data  points  would  not  be  independent  of  each  other  if  the 
continuous  data  sampling  was  obtained  by  interpolation,  and  one  would  have  to 
include  off-diagonal  covariances  in  R. 

4.5  Estimated  errors  in  the  interior 

In  section  2.1  it  was  shown  that  errors  in  the  control  variables  can  be  transformed 
into  errors  in  the  model  counterparts  of  the  observations  by  the  sensitivity  matrix,  in 
its  role  as  a  Jacobian  matrix,  i.e. 

m(e)-m(e)  =  ^[9-0]. 
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where  6  is  the  optimally  estimated  control  vector  and  6  is  the  true  control  vector.  A 
similar  transformation  may  be  constructed  to  transform  errors  in  the  control  variables 
into  errors  in  the  interior  fields  of  streamfunction  and  vorticity, 

Then  the  expected  error  variance  of  the  interior  streamfunction  is  given  by 

<  (V'tw  -  4(  W  >  =  (^) 

80  [80] 

where  P  is  the  estimated  error  covariance  of  the  control  variables,  computed  from  the 
inverse  Hessian  using  the  sensitivity  matrix  method.  The  main  diagonal  elements  of 
P  were  presented  in  the  previous  section. 


struct  the  sensitivity  matrix  described  in  section  2.3.2.  That  is,  the  cost  function 
used  in  the  optimization,  J,  is  replaced  by  an  interior  streamfunction  (or  vorticity) 
value,  'ijjlj.  The  forward  and  adjoint  models  are  then  run,  starting  from  the  optimally 
estimated  control  variables,  to  produce  the  gradient  of  with  respect  to  0  at  the 
minimum  of  the  cost  function. 

To  calculate  the  expected  error  in  streamfunction  or  vorticity  at  one  interior 
grid-point  at  one  time  requires  one  iteration  of  the  forward  and  adjoint  models.  Thus 
an  error  map  over  all  of  the  interior  grid-points  at  one  time  requires  529  iterations, 
which  can  be  done  in  a  reasonable  amount  of  time.  Twice  this  number  of  iterations 


are  performed  because  the  above  expression  for  the  expected  error  variance  of 
needs  to  be  compared  to  the  a  priori  error  variance  of  rplj  given  by 

^p„ 


where  Po  is  the  a  priori  error  covariance  matrix  for  the  control  variables  (see  section 
1.2).  The  above  expression  is  thus  the  variance  in  ■0*^  due  to  the  a  priori  variance  in 
the  initial  and  boundary  conditions. 
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The  relative  error  defined  by 


is  shown  as  a  contour  plot  in  figure  4.12,  for  the  interior  streamfunction  (top)  and 
interior  vorticity  (bottom),  from  case  C2  at  day  16. 

As  one  would  expect  the  relative  error  for  the  interior  streamfunction  is  lowest 
in  the  vicinity  of  the  observing  array,  and  is  generally  lower  in  the  southern  part  of 
the  sub“domain  where  the  interior  field  is  more  sensitive  to  the  inflow  and  outflow 
at  the  boundaries.  The  relative  error  map  for  interior  vorticity  is  more  interesting. 
Comparing  this  map  to  the  estimated  vorticity  map  at  day  16  for  case  C2  shown  in 
figure  3.9,  one  can  see  that  there  is  a  region  of  large  relative  error  in  the  northern 
part  of  the  sub-domain  that  corresponds  closely  with  the  region  of  low  vorticity  seen 
in  the  estimated  vorticity  map.  The  tongues  of  positive  and  negative  vorticity  that 
cross  the  observing  array  in  a  northwest-southeast  direction  have  low  relative  error. 

At  day  16  the  meander  event  is  passing  through  the  observing  array  (see  the 
estimated  streamfunction  plots  in  figure  3.8).  This  event  is  the  strongest  signal  in 
the  available  data,  and  the  optimization  adjusts  the  boundary  conditions  so  as  to  re¬ 
construct  this  event  in  the  estimated  flow  field.  There  is  therefore  a  strong  sensitivity 
between  the  interior  streamfunction  and  vorticity  in  the  meander  at  day  16  and  the 
boundary  conditions  at  previous  times. 

In  sections  4.2  and  4.4  of  this  chapter  it  was  illustrated  that  the  boundary 
values  of  streamfunction  and  vorticity  most  sensitive  to  the  available  data  are  those  in 
the  southwest  corner  of  the  sub-domain  at  early  times.  Through  the  model  dynamics 
these  boundary  values  are  forced  by  the  available  data  to  vary  in  order  that  the 
interior  streamfunction  and  vorticity  at  day  16,  in  the  vicinity  of  the  observing  array, 
is  in  agreement  with  the  observed  velocities.  By  this  process  information  about  the 
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point  measurements  of  velocity  is  spread  through  parts  of  the  sub-domain,  reducing 
the  relative  error. 


124 


Chapter  5 


Conclusions 


Experiments  were  carried  out  in  this  dissertation  to  explore  the  estimation  of  open 
ocean  boundary  and  initial  conditions  from  simulated  sparse  data  in  the  presence 
of  strong  nonlinearity.  The  results  of  these  experiments  represent  a  significant  step 
towards  the  assimilation  of  SYNOP  East  data  into  an  open  ocean  model. 

There  are  two  aspects  to  judging  the  success  of  an  assimilation.  The  first  aspect 
concerns  the  comparison  of  a  posteriori  statistics  with  a  priori  statistics,  which  one 
can  do  with  real  data.  The  second  aspect  concerns  comparisons  of  the  estimated 
flow  field  with  the  true  flow  field,  which  can  only  be  made  in  the  context  of  twin 
experiments. 

For  the  first  type  of  judgement,  the  primarily  accepted  measure  of  success  is 
the  final  value  of  the  cost  function.  Successful  minimization  of  the  cost  function  is 
a  necessary  condition  for  the  model  to  be  judged  consistent  with  the  data.  In  this 
study  when  the  cost  function  is  minimized  to  order  one,  the  model/data  residuals  are 
of  the  same  order  as  the  a  priori  error  in  the  data.  The  sensitivity  matrix,  which 
can  be  calculated  when  the  assimilation  uses  real  data,  illustrates  by  which  pathways 
the  data  improves  the  estimation  of  the  control  variables.  A  qualitative  view  of  the 
dependence  of  an  observation  upon  the  control  variables  can  be  readily  seen  from 
the  columns  of  the  sensitivity  matrix.  Quantitative  error  estimates  for  the  control 
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variables  can  also  be  calculated  from  the  sensitivity  matrix  by  the  method  described 
in  chapter  two.  Most  importantly,  the  error  estimates  can  be  calculated  when  the 
assimilation  uses  real  data. 

The  second  way  of  judging  the  success  of  an  assimilation  is  by  comparison  with 
the  true  fields.  Here  this  has  been  done  qualitatively  by  displaying  the  true  stream- 
function  fields  alongside  the  estimated  streamfunction  fields.  In  the  cases  presented 
in  chapter  one  and  chapter  three,  this  means  of  comparison  enables  one  to  see  that 
the  estimated  flow  field  looks  quite  similar  to  the  true  flow  field  in  the  vicinity  of  the 
observing  array,  but  differs  as  one  moves  towards  the  boundaries  where  the  estimated 
flow  field  is  more  influenced  by  the  boundary  conditions.  Quantitative  measures  of 
the  residuals  between  the  estimated  and  true  fields  were  shown  in  section  4.1. 

Chapter  one  presented  the  model  and  the  data  used  in  all  twin  experiments 
performed  in  this  study.  These  first  experiments  were  an  attempt  to  avoid  the  esti¬ 
mation  of  the  open  boundary  conditions,  by  treating  them  as  known  and  diminishing 
their  influence  on  the  assimilation  by  the  inclusion  of  a  sponge  layer  around  the 
boundary.  None  of  these  experiments  were  successful  in  diminishing  the  effect  of  the 
prescribed  boundary  conditions.  However,  case  Al,  which  employed  thirteen  current 
meters,  was  able  to  adequately  minimize  the  cost  function.  Case  B1  and  Case  Cl, 
which  both  employed  five  current  meters  and  two  tomographic  reciprocal  travel  times, 
were  not  able  to  minimize  the  cost  function  adequately.  There  was  more  information 
available  in  the  data  in  case  Al,  as  the  observing  array  covered  the  jet  at  all  times. 
The  available  data  in  cases  B1  and  Cl  contained  relatively  less  information  about 
the  jet,  as  the  jet  appeared  in  the  observations  only  at  later  times  as  it  meandered 
through  the  array. 

For  all  of  cases  Al,  B1  and  Cl,  the  estimated  flow  fields  were  quite  different 
from  the  true  flow  fields  in  regions  away  from  the  array.  It  was  seen  that  the  model 
set-up  for  these  twin  experiments  was  inconsistent  with  the  strong  flow  velocities 
observed  at  the  array.  The  prescribed  boundary  conditions  in  the  model  cause  the 
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optimization  to  estimate  vorticity  initial  conditions  that  contain  the  strong  flow  within 
the  model  boundaries.  In  the  true  flow  fleld,  over  the  period  of  the  assimilation,  this 
strong  flow  originates  from  farther  away,  beyond  the  artificial  boundaries  of  the  model. 

Chapter  two  presented  the  estimation  of  errors  for  the  estimated  control  vari¬ 
ables.  It  was  shown  by  Gauss-Mar kov  theory,  that  the  estimated  error  covariance 
of  the  control  variables  is  given  by  the  inverse  Hessian  of  the  cost  function  at  the 
minimum,  as  long  as  a  nonlinear  term  could  be  neglected.  The  Hessian  was  calcu¬ 
lated  by  two  methods:  finite-differences  and  the  sensitivity  matrix.  The  calculations 
were  performed  for  an  experiment  which  had  strongly  nonlinear  dynamics,  and  for 
an  experiment  which  had  weakly  nonlinear  dynamics.  The  control  variables  for  both 
experiments  were  the  vorticity  initial  conditions. 

The  explicit  expression  for  the  Hessian  of  the  cost  function,  equation  (2.6)  on 
page  44,  contains  a  nonlinear  term,  —  d),  arising  from  the  nonlinearity 

of  the  model  counterparts  of  the  data,  m,  with  respect  to  the  control  variables,  x. 
From  the  theory,  it  was  determined  that  this  term  needs  to  be  ignored  in  order  for  the 
estimated  error  covariance  of  the  control  variables  to  be  given  by  the  inverse  Hessian. 
The  sensitivity  matrix  method  of  computing  the  error  covariance  matrix  ignores  this 
term  from  the  outset  and  the  Hessian  is  well-conditioned,  both  for  strongly  nonlinear 
dynamics  and  weakly  nonlinear  dynamics.  The  finite-difference  method  implicitly 
includes  the  nonlinear  term  in  its  calculation  of  the  Hessian.  Thus  the  inverse  of  the 
finite-difference  Hessian  gives  the  error  covariance  matrix  only  when  the  nonlinear 
term  is  negligible.  For  weakly  nonlinear  dynamics  this  was  found  to  be  the  case. 
The  Hessian  matrices  produced  by  both  methods  were  essentially  the  same,  and  both 
could  be  inverted  to  produce  the  relative  error  in  the  vorticity  initial  conditions  shown 
in  figure  2.3. 

In  the  presence  of  strong  nonlinearity,  the  nonlinear  term  may  be  significant 
enough  to  deteriorate  the  strong  positive-definiteness  contributed  by  the  other  terms 
in  equation  2.6,  rendering  the  Hessian  ill-conditioned.  In  fact  the  finite-difference 
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Hessian  was  found  to  be  ill-conditioned,  while  the  sensitivity  matrix  method  pro¬ 
duced  a  well-conditioned  Hessian  which  was  inverted  to  give  the  relative  error  in  the 
vorticity  initial  conditions  shown  in  figure  2.1.  Despite  the  ill— conditioning  of  the 
finite-difference  Hessian,  it  was  found  to  be  very  similar,  in  structure  and  size,  to  the 
Hessian  computed  by  the  sensitivity  matrix  method.  Thus,  although  the  nonlinear 
term  makes  a  small  contribution  to  the  finite-difference  Hessian,  it  is  big  enough  to 
render  the  Hessian  ill-conditioned. 

In  terms  of  the  behavior  of  the  cost  function  in  the  vicinity  of  its  estimated 
minimum,  the  sensitivity  matrix  method  gives  only  the  local  curvature  of  the  cost 
function  in  control  space,  under  the  assumption  that  this  curvature  is  quadratic.  If 
the  residuals  in  the  model/data  misfit  are  large  compared  to  the  a  priori  errors,  as 
was  the  case  for  the  twin  experiment  used  in  chapter  two,  the  nonlinear  term  will 
be  significant  when  ni(x)  is  nonlinear.  Thus,  if  the  model/data  residuals  are  small 
in  the  nonlinear  case,  the  finite-difference  Hessian  could  be  well-conditioned  even  in 
the  presence  of  strong  nonlinearity. 

Examination  of  the  off-diagonal  structure  of  the  estimated  error  covariance 
matrix  showed  that  the  dynamics  acted  to  reduce  the  length-scale  of  spatial  covari¬ 
ance  in  the  vorticity  initial  conditions.  The  a  priori  length-scale  set  by  the  Laplacian 
smoothing  term  is  larger  than  the  length-scale  for  vorticity  that  is  inherent  in  the 
model  dynamics. 

The  sensitivity  matrix  was  also  found  to  be  useful  in  illustrating  the  depen¬ 
dence  that  the  model  counterparts  of  the  data  have  to  the  vorticity  initial  conditions. 
For  the  weakly  nonlinear  case  a  velocity  observation  was  seen  to  be  most  sensitive 
to  those  vorticity  initial  conditions  in  the  immediate  vicinity  of  the  observing  loca¬ 
tion,  even  late  in  the  assimilation  period.  For  the  strongly  nonlinear  case,  a  velocity 
observation  taken  late  in  the  assimilation  was  seen  to  be  strongly  sensitive  to  the 
vorticity  initial  conditions  throughout  the  domain  because  the  flow  field  is  forced  to 
be  contained  by  the  boundaries. 
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The  results  of  the  experiments  in  chapter  one  and  chapter  two  lead  to  the 
conclusion  that  in  order  for  the  estimated  flow  field  to  resemble  the  true  flow  field, 
the  time-dependent  values  of  streamfunction  and  vorticity  on  the  artificial  boundaries 
of  the  model  domain  need  to  be  estimated  as  part  of  the  assimilation. 

Explicit  expressions  for  the  gradient  of  the  cost  function  with  respect  to  bound¬ 
ary  streamfunction  and  vorticity  were  formulated  (shown  in  the  appendix).  The  time- 
series  of  boundary  streamfunction  and  vorticity  were  expanded  in  Fourier  series  so  as 
to  reduce  the  number  of  control  variables.  Several  twin  experiments  were  performed 
that  illustrated  the  dependence  of  the  estimated  flow  field  upon  the  first  guess  used 
for  the  control  variables,  and  upon  the  presence  of  noise  in  the  observations  and  the 
first  guess. 

The  experimental  set-up  for  these  experiments  was  more  sophisticated  than 
those  in  the  first  chapters.  In  chapter  three  the  control  variables  were  the  sufficient 
set  of  initial  and  boundary  conditions  needed  to  integrate  the  barotropic  vorticity 
equation  forward  in  time  in  a  domain  with  open  boundaries  (Charney  et  al,  1950). 
The  only  a  priori  information  needed  was  a  first  guess  and  scaling  for  the  control 
variables,  that  is,  a  point  in  control-space  at  which  to  start  the  optimization  with 
relative  weights  for  each  control  variable.  The  first  guess  and  scaling  have  physical 
meaning  and  can  be  estimated  beforehand  for  real  data:  the  first  guess  is  the  expected 
values  of  the  initial  and  boundary  conditions,  and  the  scaling  is  the  expected  variance 
in  these  values. 

All  the  twin  experiments  presented  in  chapter  three  performed  well  in  the 
sense  that  the  cost  function  was  minimized  to  order  one  for  all  experiments.  The  best 
minimization  was  for  case  C2  where  the  cost  function  was  minimized  to  the  lowest 
value  relative  to  its  value  at  the  start  of  the  minimization.  For  this  case,  persistence 
in  time  of  the  true  initial  conditions  was  used  as  a  first  guess  for  the  control  variables, 
and  noise-free  data  was  available  every  two  days.  The  choice  of  a  time— invariant  state 
as  a  first  guess  is  how  one  could  proceed  if  using  real  data.  Often  there  is  a  regional 
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hydrographic  survey  at  the  start  of  a  mooring  time-series.  If  not,  one  could  use  a 
climatology  for  the  region  for  a  first  guess.  Case  G2  was  the  same  as  case  C2  but  with 
noisy  observations,  and  the  initial  conditons  were  taken  from  simulated  hydrographic 
survey  of  streamfunction  objectively  mapped  onto  the  model  grid.  This  experiment 
may  be  regarded  as  the  most  successful  as  it  is  the  one  that  is  most  applicable  to  the 
real  world. 

In  chapter  four  the  quality  of  the  assimilation  experiments  of  chapter  three  was 
assessed.  Root-mean-square  residuals  with  the  true  field  were  presented,  showing  the 
differing  ranges  of  spread  of  information  for  each  experiment.  The  computation  of  the 
sensitivity  matrix  showed  how  information  from  an  observation  was  spread  backwards 
in  time  by  the  adjoint  model,  and  showed  the  relative  influence  of  boundary  and  initial 
conditions  on  a  datum.  The  estimated  error  covariance  for  the  estimated  initial  and 
boundary  conditions  for  case  C2  was  calculated  using  the  sensitivity  matrix  method. 
Error  maps  for  the  interior  streamfunction  and  vorticity  fields  were  also  constructed. 
From  the  error  maps  for  the  boundary  and  interior  vorticity,  the  process  can  be  seen 
by  which  strong  signals  in  the  available  data  influence  the  boundary  conditions  and 
how  the  relative  error  is  reduced  in  regions  where  the  information  spreads. 

The  sampling  period  of  the  velocity  data  was  found  to  enter  into  the  tempo¬ 
ral  behavior  of  the  estimated  boundary  streamfunction.  This  behavior  was  found  to 
be  a  property  of  the  dynamics.  The  elliptic  nature  of  the  Poisson  equation  relating 
streamfunction  to  vorticity  means  that  a  variation  in  streamfunction  anywhere  in  the 
model  domain,  at  any  particular  time,  is  instantaneously  felt  throughout  the  domain. 
This  phenomenon  gives  rise  to  jumps  in  the  boundary  streamfunction  whenever  ve¬ 
locity  observations  are  available.  Because  the  dynamical  equations  do  not  apply  at 
the  boundary  points  of  the  domain,  the  jumps  in  boundary  streamfunction  are  not 
smoothed  in  time  by  the  dynamics. 

For  all  of  the  experiments  from  chapter  three,  the  optimization  was  able  to 
minimize  the  cost  function  by  varying  the  temporal  behavior  of  the  boundary  stream- 


130 


function  and  vorticity  away  from  their  first  guess  values.  The  vorticity  is  varied  in  the 
course  of  the  optimization  at  boundary  points,  where  it  is  advected  into  the  domain 
and  subsequently  advected  through  the  observing  array.  Boundary  streamfunction, 
on  the  other  hand,  is  varied  by  the  optimization  at  all  boundary  points  due  to  the 
elliptic  effect  that  has  been  previously  discussed.  Varying  boundary  streamfunction 
at  earlier  times  has  a  greater  effect  on  the  interior  flow  field. 

The  next  step,  before  model/data  assimilations  can  be  run  using  real  data, 
is  to  implement  vertical  variability  in  the  quasi-geostrophic  dynamical  model.  One 
could  readily  include  the  first  and  second  baroclinic  modes  in  the  dynamical  model 
used  here,  or  construct  a  two  or  three  layer  model  (see  Flierl,  1978). 

An  interesting  and  potentially  powerful  observation  from  the  SYNOP  East 
dataset  is  areally-averaged  vorticity.  This  can  be  obtained  by  integrating  tomographic 
reciprocal  travel  times  around  the  edges  of  the  polygon  defined  by  the  locations  of 
the  acoustic  transceivers  (see  Chester,  1993).  Using  the  adjoint  method  with  a  quasi- 
geostrophic  model,  a  vorticity  observation  would  provide  much  more  influence  on  the 
Lagrange  multipliers  than  the  streamfunction  observations  used  here  (see  appendix, 
also  Tziperman  &  Thacker,  1989).  Moreover,  the  forcing  in  the  adjoint  model  would 
be  at  every  point  within  the  areal  average.  Note  that  the  areally-averaged  vorticity 
data  would  not  add  more  information  to  the  assimilation  beyond  the  along-path 
velocities  given  by  the  acoustic  travel  times.  By  relating  a  velocity  observation  to 
vorticity  rather  than  streamfunction,  the  adjoint  method  may  converge  more  readily. 

The  twin  experiments  presented  here  to  estimate  optimal  boundary  conditions 
used  an  assimilation  period  of  only  24  days,  during  which  time  only  one  energetic 
event  passed  through  the  array.  The  estimated  boundary  conditions  were  varied  by 
the  optimization,  around  the  inflow  area  prescribed  by  the  first  guess,  so  as  to  best 
reconstruct  this  single  energetic  event.  It  would  be  interesting  if  a  longer  assimila¬ 
tion  period  were  used  during  which  there  were  several  energetic  features  observed. 
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The  information  from  the  observations  would  spread  further  in  different  directions, 
hopefully  to  give  better  boundary  estimates. 

In  the  original  setting  of  this  study  with  respect  to  the  SYNOP  dataset  and  the 
dynamics  of  the  Gulf  Stream,  there  were  two  issues  that  needed  to  be  investigated 
before  one  could  assimilate  real  data  into  a  dynamical  model.  These  issues  were 
(i)  the  effect  of  strongly  nonlinear  dynamics  on  a  model/data  assimilation  and  the 
desirability  of  making  error  estimates  in  the  presence  of  strong  nonlinearity,  (ii)  the 
treatment  of  unknown  boundary  conditions  for  a  regional  open-ocean  model. 

The  most  important  conclusion  from  the  experiments  performed  here  is  that  it 
is  possible  to  estimate  errors  in  the  presence  of  strong  nonlinearity,  without  linearizing 
the  dynamical  evolution  of  the  flow  field. 

Modellers  have  had  many  problems  with  open  boundaries,  such  as  radiation 
conditions  at  outflow  points  (Orlanski,  1976;  Chapman,  1985;  Blumberg  &  Kantha, 
1985).  These  problems  are  circumvented  in  the  formalism  of  the  adjoint  method.  At 
every  iteration  of  the  minimization  of  the  cost  function,  the  forward  model  dynamics 
are  exactly  obeyed  because  they  are  applied  as  a  strong  constraint  in  the  adjoint 
method.  The  control  variables  are  varied  in  the  course  of  the  minimization  so  as 
to  always  keep  the  strong  constraint  satisfied.  Hence  boundary  values  of  stream- 
function  and  vorticity  along  outflow  regions  of  the  boundary  are  consistent  with  the 
neighboring  interior  values. 

There  is  also  the  problem  with  characteristic  points  on  the  boundary  where 
inflow  changes  to  outflow  and  the  streamlines  run  parallel  to  the  boundary  (Ben¬ 
nett,  1992).  The  possibility  of  infinite  vorticity  gradients  at  characteristic  points  can 
be  removed  by  adding  a  penalty  term  for  gradients  of  vorticity  on  the  boundary 
(Bennett,  1992).  In  the  experiments  of  chapter  three  a  term  was  added  to  the  cost 
function  to  penalize  the  second  derivatives  of  streamfunction  and  vorticity  along  the 
boundary.  This  part  of  the  cost  function  may  be  preventing  large  vorticity  gradients 
from  forming  near  characteristic  points.  The  formal  ill— posedness  at  characteristic 
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points  is  a  rather  subtle  matter.  Further  study  is  needed  to  isolate  this  effect  from 
other  much  larger  sources  of  error  in  experiments  such  as  the  ones  presented  here. 

In  a  previous  study  (Seiler,  1993),  boundary  values  of  streamfunction  and 
vorticity  were  optimally  estimated  using  the  adjoint  method,  while  keeping  the  initial 
conditions  at  some  specified  values.  Here  the  full  set  of  initial  and  boundary  conditions 
are  estimated  from  the  data. 

For  most  of  the  twin  experiments  presented  here,  the  first  guess  values  for  the 
boundary  and  initial  conditions  were  chosen  based  upon  almost  total  ignorance.  The 
first  guess  values  for  the  control  variables  were  far  from  the  true  values  in  control- 
space.  For  all  of  the  experiments  except  case  E2,  it  was  seen  that  the  estimated 
boundary  conditions  were  no  closer  to  the  true  boundary  conditions  than  were  their 
first  guess  values.  However,  for  case  E2  we  saw  that  when  the  first  guess  values 
were  sufficiently  close  to  the  true  values,  the  optimal  estimates  were  closer  to  the 
true  values  than  they  were  to  their  first  guess  values.  Although  the  first  guess  for 
case  E2  was  obtained  by  randomly  perturbing  the  true  values  causing  the  boundary 
smoothness  cost  rather  than  the  model/data  cost  to  force  the  estimates  towards  the 
true  values,  the  results  indicate  that  when  the  first  guess  is  within  some  neighborhood 
of  the  true  values  the  estimates  will  be  forced  towards  the  true  values. 

For  the  assimilation  of  the  SYNOP  dataset,  the  true  boundary  conditions  are 
unknown.  One  could  follow  the  strategy  of  case  G2  and  objectively  map  a  large-scale 
time-dependent  dataset  for  the  region  (such  as  OTIS  for  the  Gulf  Stream  system), 
onto  the  grid— points  of  the  model  domain  and  onto  the  temporal  variation  at  the 
boundaries,  and  use  these  mapped  fields  as  a  first  guess  for  the  initial  and  boundary 
conditions.  The  methods  presented  in  this  study  could  then  be  used  to  improve  the 
estimated  values  of  the  initial  and  boundary  conditions,  the  improvement  arising  from 
the  assimilation  of  the  interior  observations  with  the  dynamical  constraints. 
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Appendix  A 


The  adjoint  method 

A*1  General  Form 

The  nonlinear  minimization  problem  and  a  method  of  solution  are  presented  here. 
As  in  the  other  chapters,  vectors  are  denoted  by  lower-case  boldface  characters  and 
matrices  are  denoted  by  upper-case  boldface  characters. 

Given  some  state  variables,  and  control  variables,  0,  that  are  related  by 
0)  =  0,  the  goal  is  to  find  the  control  variables  that  minimize  some  cost  function 
or  performance  criteria,  J[(f),6).  That  is,  minimize 

with  respect  to  6  while  keeping  the  constraint 

where  the  cost  function  and  the  constraint  can  be  nonlinear  functions.  The  funda¬ 
mental  difference  between  state  variables  and  control  variables  here  is  that  the  values 
of  control  variables  are  unknown  and  need  to  be  found,  whereas  the  state  variables 
are  determined  from  the  control  variables  via  the  constraint. 

When  the  system  of  equations,  f(0,0)  —  0,  is  treated  as  a  strong  constraint, 
then  Lagrange  multipliers  may  be  introduced  to  solve  this  problem.  An  augmented 
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cost  function  is  defined, 

L{<i),e,x)  =  j{ci>,e)  +  x^f{<l),9) 

where  the  Lagrange  multipliers,  A,  can  be  arbitrarily  chosen.  When  the  constraint  is 
satisfied,  L  =  J,  and  thus  also  the  first  variation,  dL  =  dJ,  where 


L{(j),e,X  +  d\)  -  L{(j),e,X) 

d(i>  ^  de  dx 

Usually  J  is  a  quadratic,  as  in  this  study,  and  has  a  minimum  where  dJ  =  0.  Thus 
minimizing  J  entails  finding  a  point  where  the  gradients  vanish.  Now  it  is  obvious 
that  ^  =  0,  from  the  constraint.  The  A  are  now  chosen  such  that  ^^dcf)  —  0,  ie. 

This  system  is  called  the  adjoint  equations,  from  which  the  A  can  be  determined. 


Thus, 


Q  r 

dJ  =  =  —dd 

dO 


or 

dJ  _  dL 

which  states  that  is  the  gradient  of  J  with  respect  to  6,  while  holding  f(^,  6)  =  0 
and  while  the  A  are  given  by  the  adjoint  equations. 

An  explicit  expression  for  the  gradient  is  determined  from  the  augmented 
cost  function.  A  gradient-search  algorithm  can  then  be  used  to  find  a  minimum  of 
the  cost  function  with  respect  to  the  control  variables.  This  can  be  quite  an  involved 
task  when  there  are  thousands  of  control  variables,  yet  using  a  conjugate  gradient 
routine  the  minimum  is  found  in  a  few  iterations  when  the  constraint  is  linear.  For 
nonlinear  constraints  there  may  be  multiple  local  minima,  and  the  minimum  found 
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by  gradient-search  may  depend  upon  where  the  search  was  started  from,  ie.  the  first 
choice  for  the  control  variables. 

In  this  study  the  cost  function  is  the  least-squares  misfit  between  the  data  and 
the  model  counterparts  of  the  data.  The  constraint  is  the  time  evolution  of  the  state 
variables  given  by  a  dynamical  model.  The  full  mathematical  formulation  is  quite 
cumbersome  involving  much  algebra.  Only  the  more  salient  features  of  the  formula¬ 
tion  are  presented  here.  Other  derivations  and  formulations  of  adjoint  methods  for 
ocean  models  may  be  found  in  Tziperman  &  Thacker  (1989),  Thacker  &  Long  (1988) 
and  Talagrand  &  Courtier  (1987). 

Making  the  problem  time-dependent,  the  constraint  is  the  time  evolution  equa¬ 
tion  for  the  state  variables,  and  the  control  variables  are  the  initial  conditions  for  the 
state  variables,  (pQ.  The  cost  function  is  the  least-squares  model/data  misfit,  equation 
(1.3).  The  problem  becomes:  Minimize 

T  _  V"  -  dkf 

-  2^  ^2 


with  respect  to  (p^  while 


for  alH  =  0, . . . ,  iV.  As  before  we  construct  the  augmented  cost  function 

N 

L{(po,  ■  ■  ■ ,  (pNi  -^1)  ■  ■  • )  ^n)  =  >7  +  ^  \  {(pt  ■“  f(‘/^t-i)]- 

t=i 


Taking  the  first  variation  we  get 

dL 


dL  = 


d(po 


N 

d(po  +  XI 

t=i 


dL 

d<Pt 


d(pt  -|- 


dL 


dXt 


and  dJ  =  dL  when  the  time-evolution  equation  for  the  state  variables  is  being 
satisfied.  As  before  =  0,  and  we  get  the  adjoint  model  equations  by  setting 


ah 

Wt 


d(p^  —  0,  for  all  t  =  1, . . . ,  iV,  ie. 


'  dJ 

.dcP, 


+  At 


'  df 
Mt 


w+i 
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where  A/z+i  =  0.  This  is  an  evolution  equation  for  the  Lagrange  multipliers  that  runs 
backward  in  time  forced  by  Note  that  this  adjoint  model  is  linear  in  Aj,  and 

uses  the  transpose  of  as  its  transition  matrix 
of  solution  is  called  the  adjoint  method). 

When  the  adjoint  model  and  the  forward  model  equations  are  satisfied, 


(which  is  why  this  whole  procedure 


dj  —  dL  —  d(j)Q 

OCpQ 

thus  the  gradient  of  the  cost  function  with  respect  to  the  control  variables  is 

dJ  _  dJ  T 

d(l>o  d(po  ^  d(f)o 

which  can  be  calculated,  then  used  to  find  the  minimum  of  J  with  respect  to  (^q. 

Given  the  gradient  of  J  with  respect  to  (f)^,  one  can  start  from  some  guess 
of  (pQ,  calculate  the  gradient  and  move  in  the  direction  of  the  minimum  of  J.  Then 
make  a  new  guess  for  <pQ,  get  the  new  gradient  and  iteratively  move  through  the 
vector  space  spanned  by  the  control  variables,  looking  for  the  minimum  of  J.  There 
are  many  gradient  search  routines  available  that  perform  this  procedure. 


A. 2  Discretized  Ocean  Model 

The  forward  model  equations,  (1.1),  are  appended  to  the  cost  function  with  Lagrange 
multipliers,  a\j,  and  to  make  an  augmented  Lagrangian, 


N-l  M-1  /  T 

i  =  ■’+'£'£  ii:  Vh  (vv*,-  -  C'i)  +  - >jI)1  + 

t=2  j=2  \t=0 


T 

t=l 


t-1 


0  •  -  C  “ 


^  +  J  (^.  0‘,r + /3  ( J'  +  V  vy 


(A.1) 


where  ^  is  the  nine-point  Arakawa  Jacobian  defined  by 


(V’j  Oi,j  =  [  (V’i-i.i  “  i’ij-i)  + 
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(V’i-ij-i  +  V'i-I.i  -  V't+i.i-i  -  V’t+i.i)  + 
(V’t.i-1  -  V’i+i.i)  Ci+i,j-i  + 

(V’i+lij-l  V’tJ-l  ~  V’t+l,j+l  ~  '0i,i+l)  Ct+l,i 
(V’i+i.i  -  V’i,j+i)  Ct'+i,i+i  + 

(V’i+i.i+i  +  V’i+i,i  “  V’t-i.i+i  “  V’t-i.i)  Ct,i+i  + 

(V’i,i+i  -  Ct-i,i+i  + 

(V'i-l,i+l  +  V’i,i+l  ~  “  V’i.j-l)  ] 


1 


12AxAy 


and  the  standard  five-point  Laplacian  is  used,  i  and  j  are  the  indices  for  the  east-west 
direction  and  the  north-south  direction  respectively,  t  is  the  index  for  the  time-step. 

The  forward  model  equations  that  are  appended  to  the  cost  function  with  La¬ 
grange  multipliers  in  equation  (A.l)  are  in  a  slightly  different  yet  equivalent  form  to 
the  forward  model  equations  (1.1)  on  page  19.  The  biharmonic  operator  acting  on 
vorticity  in  the  horizontal  friction  term  of  (1.1)  somewhat  complicates  the  discretiza¬ 
tion.  To  ease  the  discrete  formulation  of  the  forward  and  adjoint  model  equations, 
the  state  variable,  7/*^,  is  introduced.  It  is  defined  by 


■  =  v  ■■ 


The  boundary  conditions  for  rjl  j  need  to  be  specified  in  order  to  integrate  the  forward 
model  equations.  T]^j  is  set  to  zero  on  the  boundaries  for  all  time,  this  assumption 
about  the  Laplacian  of  vorticity  on  the  boundaries  does  not  affect  the  results  signifi¬ 
cantly  (Seiler,  1993). 

For  the  optimization  experiments  presented  in  chapter  three,  the  state  variables 

are: 

iplj,  Tjjj  at  interior  points  for  t  =  0, . . .  ,T 
Qj  at  interior  points  for  t  =  1,. . .  ,T. 

The  control  variables  are: 

V’fc,  Cb  ioT  t  =  0, . . .  ,T 

at  interior  points. 

where  the  subscript  b  denotes  the  grid  points  at  the  edges  of  the  model  domain. 
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Following  the  procedure  outlined  in  the  last  section,  the  first  variation  of  L 
(given  by  equation  A.l)  is  taken.  Setting  the  gradients  of  L  with  respect  to  the 
Lagrange  multipliers  to  zero  returns  the  forward  model  equations.  The  adjoint  model 
equations  are  obtained  by  setting  the  gradient  of  L  with  respect  to  the  state  variables 
to  zero,  ie. 

=  0  at  interior  points  for  t  =  1, . . . ,  T 


dL 

dL 


=  0  at  interior  points  for  t  =  0, . . .  ,T 
=  0  at  all  points  for  f  =  0, . . . ,  T 


After  much  algebra  these  equations  become  respectively 


Q*  ■  — 

»,j  t.j 


At 


dJ 


(A.2) 


These  three  equations  are  referred  to  as  the  adjoint  model  equations  in  this  study. 
The  first  is  a  prognostic  equation  for  a,  the  second  and  third  are  diagnostic  equations 
for  {i  and  v  respectively.  Together  with  the  forward  model  equations  they  are  called 
the  Euler-Lagrange  equations. 

The  appropriate  boundary  conditions  for  the  Lagrange  multipliers  in  the  ad¬ 
joint  model  equations  are  zero  at  all  boundary  points  for  all  time  and  zero  at  the 
final  time.  The  adjoint  model  equations  are  integrated  backward  in  time  forced  by 
■M-  at  interior  grid-points.  This  forcing  term  can  be  derived  by  differentiating  the 
cost  function  given  by  equation  (1.5)  on  page  23,  with  respect  to  ‘iplj.  This  adjoint 
forcing  term  is  non-zero  at  grid  points  and  times  where  the  model  counterpart  of  the 
data  is  a  functional  of  the  streamfunction.  Note  that  if  observations  of  vorticity,  or 
the  Laplacian  of  vorticity,  were  available  there  would  be  forcing  terms  due  to 
and  in  the  adjoint  model  equations.  This  not  the  case  in  any  of  the  experiments 
performed  in  this  study,  and  observations  of  such  high  spatial  derivatives  are  unlikely 
in  the  real  ocean. 
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The  complete  history  of  streamfunction  and  vorticity  in  space  and  time  from 
the  forward  model,  is  needed  to  integrate  the  adjoint  model  equations.  This  may 
present  a  problem  in  computing  memory  for  long  assimilation  periods.  Here  only  the 
streamfunction  history  needs  to  be  stored  from  the  forward  model,  as  vorticity  can 
be  readily  calculated  from  streamfunction  and  the  boundary  conditions  for  vorticity 
via  the  Poisson  equation. 

The  first  variation  of  L  now  only  contains  the  gradients  of  L  with  respect  to 
the  control  variables,  ie. 


N~\  M-1  f)j 

=  E  E  + 

iz=2  3^2 

T  [  N 

E  E 

t=0  U=1 
M-1  / 


dL  dL  dL  ,  dL  ' 

(  dL  dL  dL  r  It  ,  dL  ' 


+ 


(A.3) 


Explicit  expressions  for  the  derivatives  of  L  with  respect  to  the  control  variables  can 
be  obtained  by  differentiating  equation  (A.l).  After  some  algebra  we  get. 

Gradient  for  initial  vorticity  at  interior  points  (i  ^  1  or  N  ;  j^l  or  M): 

dL  _ 


At 


where 


0 

"  dC?j 


-  y^i.3 


OL. 


Ai 


since  is  not  defined  in  the  augmented  Lagrangian.  This  expression  for  the  gradi¬ 
ent  with  respect  to  the  initial  vorticity  conditions  is  that  obtained  by  Tziperman  & 
Thacker  (1989). 

Gradient  for  boundary  streamfunction  at  all  times  (for  example  along  the  western 
edge  of  the  model  domain,  i=l) 

dL  dJ  p 


^t+1  ^ 


dipi  j  2Ax  '  Ax^ 


[(Cl,i4-1  -  C2,i)  OC2,j+l  + 
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(Ci,i+i  +  C2,i+i  -  -  (2,7-1)02,7  + 

((2,7  -  (1,7-1)02,7-1] 


(A.4) 


12AxAi/ 

Gradient  for  boundary  vorticity  at  all  times  (for  example  on  the  western  edge  of  the 
model  domain,  i=l) 

dL  dJ 

_  I 
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Ax^ 
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[(l/'l,7+l  -  V’2,7)  02,7+1  + 

(V’1,7+1  +  V’2,7+1  ~  V'1,7-1  “  V’2,7-1)  02,7  + 


(A.5) 


The  adjoint  method  works  as  follows: 

(i)  A  first  guess  for  the  control  vector  is  selected. 

{ii)  Boundary  and  initial  conditions  are  taken  from  the  control  vector  and  used  to 
integrate  the  forward  model  equations  forward  in  time. 

(in)  At  observation  times  and  locations,  the  model  data  misfit  is  calculated,  and  at 
the  end  of  the  integration  period  the  value  of  the  cost  function  is  calculated. 

[iv)  The  cost  function,  the  model/data  misfits,  and  the  complete  forward  history  of 
streamfunction  are  stored. 

(u)  The  adjoint  model  equations  are  integrated  backward  in  time,  forced  by  the 
model/data  misfit  at  observing  times  and  locations. 

(u)  Along  the  way,  and  when  t  =  0  is  reached,  the  gradients  of  the  augmented  cost 
function  with  respect  to  boundary  and  initial  conditions  are  calculated  and  stored  in 
a  vector. 

{vi)  Values  of  cost  function,  gradient  vector  and  control  vector,  are  passed  to  the 
gradient-search  routine,  which  returns  a  new  control  vector. 

(vii)  If  the  norm  of  the  gradient  vector  is  below  a  certain  value  then  stop.  If  it  is  not 
then  go  to  (ii) 

The  above  sequence  of  steps  is  called  one  iteration  of  the  optimization  or 
minimization. 
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Following  are  a  few  comments  about  the  various  terms  in  the  adjoint  model 
equations  and  the  gradient  expressions: 

—  The  appearance  of  the  cost  function  J  in  (A. 4)  and  (A. 5)  is  not  due  to  there 
being  data  on  the  boundaries.  This  forcing  term  arises  from  the  terms  for  size  and 
smoothness  of  boundary  streamfunction  and  vorticity  in  the  cost  function. 

Terms  are  included  in  J  to  enforce  spatial  smoothness  of  streamfunction  and 
vorticity  along  the  boundaries. 

—  When  the  first  variation  of  the  augmented  cost  function  given  by  equation  (A.l) 
is  taken,  the  right-hand  side  is  integrated  by  parts  and  manipulated  into  the  form 
of  equation  (A. 3).  This  is  done  in  order  that  the  coefficients  of  dC^-,  etc.  can  be 
equated  with  the  gradients  of  L  in  equation  (A. 3).  There  are  boundary  terms  resulting 
from  the  integration  by  parts  that  are  due  to  the  spatial  derivatives  in  equation 
(A.l).  Thus  on  the  right-hand  side  of  the  expression  for  the  gradient  with  respect  to 
boundary  streamfunction,  equation  (A. 4),  the  second  term  is  from  the  term  in  the 
prognostic  equation  and  the  third  term  is  from  the  Laplacian  relating  streamfunction 
to  vorticity.  The  fourth  term  on  the  right  hand  side  of  equation  (A. 4)  is  from  the 
“edge”  of  the  Arakawa  Jacobian  along  the  western  boundary.  For  the  continuous 
integral  formulation  of  the  adjoint,  it  can  be  shown  that  as  Ax,  Ay  >  0  this  latter 
term  becomes  Oi^. 

The  interior  flow  field,  during  the  evolution  of  the  forward  model,  responds 
to  the  boundary  values  of  streamfunction  and  vorticity  through  the  various  spatial 
derivatives.  As  the  adjoint  model  equations  are  integrated  backward  in  time  the 
Lagrange  multipliers  advect  and  propagate  the  model/data  misfit  throughout  the  in¬ 
terior.  The  adjoint  forms  of  the  various  spatial  derivatives  near  the  boundaries,  as 
described  above,  allow  the  gradient  of  the  cost  with  respect  to  boundary  streamfunc¬ 
tion  and  vorticity  to  respond  to  interior  forcing. 

—  The  forward  model  equations,  that  are  used  as  a  strong  constraint  in  the  augmented 
cost  function  (equation  A.l),  only  hold  at  the  interior  grid-points.  The  disadvantage 


142 


of  this  is  that  streamfunction  and  vorticity  at  the  boundary  grid-points  are  not  re¬ 
quired  to  obey  any  dynamical  constraints.  Thus  the  gradients  of  the  augmented  cost 
function  with  respect  to  boundary  streamfunction  and  vorticity,  equations  (A,4)  and 
(A. 5),  only  contain  contributions  from  terms  in  the  forward  model  equations  that 
have  spatial  derivatives.  There  is  no  constraint  on  the  temporal  evolution  of  bound¬ 
ary  streamfunction  and  vorticity. 

Now  from  the  adjoint  model  equations  which  govern  the  behavior  of  the  La¬ 
grange  multipliers  in  the  interior,  equations  (A. 2),  one  can  see  that  the  Laplacian  of 
fi  is  proportional  to  the  forcing  term  from  the  model/data  misfit,  This  Poisson 

equation  needs  to  be  solved  for  fi  at  each  time-step  as  the  adjoint  model  equations  are 
integrated  backward  in  time.  Due  to  the  elliptic  character  of  the  Poisson  equation, 

the  field  of  of  /x  at  all  interior  grid  points  instantaneously  responds  to  any  changes  in 
dJ 

drPl.  • 

The  gradient  with  respect  to  boundary  streamfunction,  equation  (A. 4),  is  pro¬ 
portional  to  the  Lagrange  multiplier,  /i,  at  the  grid-points  adjacent  to  the  boundary 
points.  Thus  at  an  observation  time,  the  gradient  with  respect  to  boundary  stream- 
function  instantaneously  responds  at  all  boundary  points,  to  the  forcing  by 
These  instantaneous  responses  produce  spikes  at  observation  times,  in  the  gradient 
with  respect  to  boundary  streamfunction.  In  the  course  of  the  minimization  the  con¬ 
trol  vector  is  preferentially  varied  in  the  directions  of  these  high  gradients.  Hence  for 
data  that  is  sampled  sparsely  in  time,  the  values  of  streamfunction  all  around  the 
boundary  will  show  sensitivity  to  the  data  at  the  sampling  times. 

Because  there  are  no  temporal  constraints  on  the  boundary  streamfunction  or 
vorticity,  the  spikes  in  the  boundary  streamfunction  are  not  smoothed  at  all  by  the 
dynamics. 

The  gradient  with  respect  to  boundary  vorticity,  equation  (A. 5),  does  not  ex¬ 
hibit  this  type  of  behavior  as  it  is  not  proportional  to  interior  values  of  /x.  The 
instantaneous  domain-wide  changes  in  /x  appear  as  a  forcing  in  the  prognostic  equa- 


143 


tion  for  a.  The  field  of  a  is  therefore  smooth  in  time,  resulting  in  smooth  fields  of  v. 
Hence  the  gradient  with  respect  to  boundary  vorticity  is  smooth  in  time. 

—  The  particular  gradient -search  method  that  is  to  minimize  the  cost  function  is 
the  L-BFGS  method,  which  is  a  limited-memory  quasi-Newton  method.  The  reader 
is  referred  to  Zou  et  al.  (1993)  and  Wang  et  aL  (1995),  who  describe  the  L-BFGS 
method  in  detail.  For  this  method  the  iterative  search  for  the  minimum  stops  when 
the  norm  of  the  gradient  vector  falls  below  a  pre-determined  value.  In  this  study  the 
pre-determined  value  is  chosen  by  calculating  the  gradient  when  the  true  values  of 
the  control  variables  are  supplied  as  the  first  guess  for  the  minimization.  Gill  et  a/., 
(1981),  give  a  comprehensive  treatment  of  methods  of  solving  nonlinear  optimization 
problems. 

—  Navon  et  al  (1992)  indicate  that  the  large  scale  features  of  the  flow  field  are 
constructed  during  the  first  several  iterations  of  the  minimization  using  the  conjugate 
gradient  method  to  look  for  the  minimum  of  the  cost  function.  Later  iterations  make 
finer  adjustments  that  do  not  affect  the  flow  field  significantly. 
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